Elementary Algorithms

Larry LIU Xinyu
Version: 0.6180339887498949
Email: liuxinyu95@gmail.com

July 25, 2014
Contents 
I Preface 5 
0.1 Why? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 
0.2 The smallest free ID problem, the power of algorithms . . . . . . 7 
0.2.1 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 8 
0.2.2 Improvement 2, Divide and Conquer . . . . . . . . . . . . 9 
0.2.3 Expressiveness vs. Performance . . . . . . . . . . . . . . . 10 
0.3 The number puzzle, power of data structure . . . . . . . . . . . . 12 
0.3.1 The brute-force solution . . . . . . . . . . . . . . . . . . . 12 
0.3.2 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 12 
0.3.3 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . 15 
0.4 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 18 
0.5 Structure of the contents . . . . . . . . . . . . . . . . . . . . . . . 18 
0.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 
II Trees 23 
1 Binary search tree, the `hello world' data structure 25 
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 
1.2 Data Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 
1.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 
1.4 Traversing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 
1.5 Querying a binary search tree . . . . . . . . . . . . . . . . . . . . 33 
1.5.1 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 33 
1.5.2 Minimum and maximum . . . . . . . . . . . . . . . . . . . 34 
1.5.3 Successor and predecessor . . . . . . . . . . . . . . . . . . 34 
1.6 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 
1.7 Randomly build binary search tree . . . . . . . . . . . . . . . . . 40 
1.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 
2 The evolution of insertion sort 43 
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 
2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 
2.3 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 
2.4 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 
2.5 Final improvement by binary search tree . . . . . . . . . . . . . . 49 
2.6 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 
3 Red-black tree, not so complex as it was thought 53 
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 
3.1.1 Exploit the binary search tree . . . . . . . . . . . . . . . . 53 
3.1.2 How to ensure the balance of the tree . . . . . . . . . . . 54 
3.1.3 Tree rotation . . . . . . . . . . . . . . . . . . . . . . . . . 56 
3.2 Definition of red-black tree . . . . . . . . . . . . . . . . . . . . . 58 
3.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 
3.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 
3.5 Imperative red-black tree algorithm ⋆ . . . . . . . . . . . . . . . 71 
3.6 More words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 
4 AVL tree 77 
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 
4.1.1 How to measure the balance of a tree? . . . . . . . . . . . 77 
4.2 Definition of AVL tree . . . . . . . . . . . . . . . . . . . . . . . . 77 
4.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 
4.3.1 Balancing adjustment . . . . . . . . . . . . . . . . . . . . 82 
4.3.2 Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . 86 
4.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 
4.5 Imperative AVL tree algorithm ⋆ . . . . . . . . . . . . . . . . . . 88 
4.6 Chapter note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 
5 Trie and Patricia 95 
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 
5.2 Integer Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 
5.2.1 Definition of integer Trie . . . . . . . . . . . . . . . . . . 96 
5.2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 
5.2.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 
5.3 Integer Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 
5.3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 100 
5.3.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 
5.3.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 
5.4 Alphabetic Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 
5.4.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 109 
5.4.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 
5.4.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 
5.5 Alphabetic Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . 113 
5.5.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 113 
5.5.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 
5.5.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 
5.6 Trie and Patricia applications . . . . . . . . . . . . . . . . . . . . 121 
5.6.1 E-dictionary and word auto-completion . . . . . . . . . . 121 
5.6.2 T9 input method . . . . . . . . . . . . . . . . . . . . . . . 125 
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 
6 Suffix Tree 133 
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 
6.2 Suffix trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 
6.2.1 Node transfer and suffix link . . . . . . . . . . . . . . . . 135 
6.2.2 On-line construction . . . . . . . . . . . . . . . . . . . . . 136
6.3 Suffix Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 
6.3.1 On-line construction . . . . . . . . . . . . . . . . . . . . . 141 
6.4 Suffix tree applications . . . . . . . . . . . . . . . . . . . . . . . 150 
6.4.1 String/Pattern searching . . . . . . . . . . . . . . . . . . . 150 
6.4.2 Find the longest repeated sub-string . . . . . . . . . . . . 152 
6.4.3 Find the longest common sub-string . . . . . . . . . . . . 153 
6.4.4 Find the longest palindrome . . . . . . . . . . . . . . . . . 155 
6.4.5 Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 
6.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 156 
7 B-Trees 159 
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 
7.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 
7.2.1 Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 
7.3 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 
7.3.1 Merge before delete method . . . . . . . . . . . . . . . . . 168 
7.3.2 Delete and fix method . . . . . . . . . . . . . . . . . . . 176 
7.4 Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 
7.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 183 
III Heaps 187 
8 Binary Heaps 189 
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 
8.2 Implicit binary heap by array . . . . . . . . . . . . . . . . . . . . 189 
8.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 190 
8.2.2 Heapify . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 
8.2.3 Build a heap . . . . . . . . . . . . . . . . . . . . . . . . . 192 
8.2.4 Basic heap operations . . . . . . . . . . . . . . . . . . . . 194 
8.2.5 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 
8.3 Leftist heap and Skew heap, the explicit binary heaps . . . . . . 201 
8.3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 202 
8.3.2 Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 
8.3.3 Basic heap operations . . . . . . . . . . . . . . . . . . . . 204 
8.3.4 Heap sort by Leftist Heap . . . . . . . . . . . . . . . . . . 206 
8.3.5 Skew heaps . . . . . . . . . . . . . . . . . . . . . . . . . . 206 
8.4 Splay heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 
8.4.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 208 
8.4.2 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 
8.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 215 
9 From grape to the world cup, the evolution of selection sort 219 
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 
9.2 Finding the minimum . . . . . . . . . . . . . . . . . . . . . . . . 221 
9.2.1 Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 
9.2.2 Grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 
9.2.3 performance of the basic selection sorting . . . . . . . . . 224 
9.3 Minor Improvement . . . . . . . . . . . . . . . . . . . . . . . . . 225 
9.3.1 Parameterize the comparator . . . . . . . . . . . . . . . . 225
9.3.2 Trivial fine tune . . . . . . . . . . . . . . . . . . . . . . . 226 
9.3.3 Cock-tail sort . . . . . . . . . . . . . . . . . . . . . . . . . 227 
9.4 Major improvement . . . . . . . . . . . . . . . . . . . . . . . . . 231 
9.4.1 Tournament knock out . . . . . . . . . . . . . . . . . . . . 231 
9.4.2 Final improvement by using heap sort . . . . . . . . . . . 239 
9.5 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 
10 Binomial heap, Fibonacci heap, and pairing heap 243 
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2 Binomial Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 248 
10.3 Fibonacci Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 
10.3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 258 
10.3.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 260 
10.3.3 Running time of pop . . . . . . . . . . . . . . . . . . . . . 269 
10.3.4 Decreasing key . . . . . . . . . . . . . . . . . . . . . . . . 271 
10.3.5 The name of Fibonacci Heap . . . . . . . . . . . . . . . . 273 
10.4 Pairing Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 
10.4.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 275 
10.4.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 276 
10.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 282 
IV Queues and Sequences 285 
11 Queue, not so simple as it was thought 287 
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287 
11.2 Queue by linked-list and circular buffer . . . . . . . . . . . . . . 288 
11.2.1 Singly linked-list solution . . . . . . . . . . . . . . . . . . 288 
11.2.2 Circular buffer solution . . . . . . . . . . . . . . . . . . . 291 
11.3 Purely functional solution . . . . . . . . . . . . . . . . . . . . . . 294 
11.3.1 Paired-list queue . . . . . . . . . . . . . . . . . . . . . . . 294 
11.3.2 Paired-array queue - a symmetric implementation . . . . 296 
11.4 A small improvement, Balanced Queue . . . . . . . . . . . . . . . 298 
11.5 One more step improvement, Real-time Queue . . . . . . . . . . 300 
11.6 Lazy real-time queue . . . . . . . . . . . . . . . . . . . . . . . . . 307 
11.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 310 
12 Sequences, The last brick 313 
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 
12.2 Binary random access list . . . . . . . . . . . . . . . . . . . . . . 314 
12.2.1 Review of plain-array and list . . . . . . . . . . . . . . . . 314 
12.2.2 Represent sequence by trees . . . . . . . . . . . . . . . . . 314 
12.2.3 Insertion to the head of the sequence . . . . . . . . . . . . 316 
12.3 Numeric representation for binary random access list . . . . . . . 322 
12.3.1 Imperative binary access list . . . . . . . . . . . . . . . . 324 
12.4 Imperative paired-array list . . . . . . . . . . . . . . . . . . . . . 327 
12.4.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 327 
12.4.2 Insertion and appending . . . . . . . . . . . . . . . . . . . 328
12.4.3 random access . . . . . . . . . . . . . . . . . . . . . . . . 328 
12.4.4 removing and balancing . . . . . . . . . . . . . . . . . . . 329 
12.5 Concatenate-able list . . . . . . . . . . . . . . . . . . . . . . . . . 331 
12.6 Finger tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 
12.6.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 335 
12.6.2 Insert element to the head of sequence . . . . . . . . . . . 337 
12.6.3 Remove element from the head of sequence . . . . . . . . 340 
12.6.4 Handling the ill-formed finger tree when removing . . . . 341 
12.6.5 append element to the tail of the sequence . . . . . . . . . 346 
12.6.6 remove element from the tail of the sequence . . . . . . . 347 
12.6.7 concatenate . . . . . . . . . . . . . . . . . . . . . . . . . . 349 
12.6.8 Random access of finger tree . . . . . . . . . . . . . . . . 354 
12.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 365 
V Sorting and Searching 369 
13 Divide and conquer, Quick sort vs. Merge sort 371 
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 
13.2 Quick sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 
13.2.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 372 
13.2.2 Strict weak ordering . . . . . . . . . . . . . . . . . . . . . 373 
13.2.3 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 
13.2.4 Minor improvement in functional partition . . . . . . . . 377 
13.3 Performance analysis for quick sort . . . . . . . . . . . . . . . . . 379 
13.3.1 Average case analysis ⋆ . . . . . . . . . . . . . . . . . . . 380 
13.4 Engineering Improvement . . . . . . . . . . . . . . . . . . . . . . 383 
13.4.1 Engineering solution to duplicated elements . . . . . . . . 383 
13.5 Engineering solution to the worst case . . . . . . . . . . . . . . . 390 
13.6 Other engineering practice . . . . . . . . . . . . . . . . . . . . . . 394 
13.7 Side words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 
13.8 Merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 
13.8.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 396 
13.9 In-place merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 403 
13.9.1 Naive in-place merge . . . . . . . . . . . . . . . . . . . . . 403 
13.9.2 in-place working area . . . . . . . . . . . . . . . . . . . . 404 
13.9.3 In-place merge sort vs. linked-list merge sort . . . . . . . 409 
13.10 Nature merge sort . . . . . . . . . . . . . . . . . . . . . . . . . 411 
13.11 Bottom-up merge sort . . . . . . . . . . . . . . . . . . . . . . . 416 
13.12 Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 
13.13 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 
14 Searching 423 
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 
14.2 Sequence search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 
14.2.1 Divide and conquer search . . . . . . . . . . . . . . . . . . 424 
14.2.2 Information reuse . . . . . . . . . . . . . . . . . . . . . . . 444 
14.3 Solution searching . . . . . . . . . . . . . . . . . . . . . . . . . . 471 
14.3.1 DFS and BFS . . . . . . . . . . . . . . . . . . . . . . . . . 471 
14.3.2 Search the optimal solution . . . . . . . . . . . . . . . . . 507
14.4 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535 
VI Appendix 539 
Appendices 
A Lists 541 
A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 
A.2 List Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 
A.2.1 Empty list . . . . . . . . . . . . . . . . . . . . . . . . . . . 542 
A.2.2 Access the element and the sub list . . . . . . . . . . . . . 542 
A.3 Basic list manipulation . . . . . . . . . . . . . . . . . . . . . . . . 543 
A.3.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . 543 
A.3.2 Empty testing and length calculating . . . . . . . . . . . . 544 
A.3.3 indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 
A.3.4 Access the last element . . . . . . . . . . . . . . . . . . . 546 
A.3.5 Reverse indexing . . . . . . . . . . . . . . . . . . . . . . . 547 
A.3.6 Mutating . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 
A.3.7 sum and product . . . . . . . . . . . . . . . . . . . . . . . 559 
A.3.8 maximum and minimum . . . . . . . . . . . . . . . . . . . 563 
A.4 Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567 
A.4.1 mapping and for-each . . . . . . . . . . . . . . . . . . . . 567 
A.4.2 reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573 
A.5 Extract sub-lists . . . . . . . . . . . . . . . . . . . . . . . . . . . 575 
A.5.1 take, drop, and split-at . . . . . . . . . . . . . . . . . . . 575 
A.5.2 breaking and grouping . . . . . . . . . . . . . . . . . . . . 577 
A.6 Folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582 
A.6.1 folding from right . . . . . . . . . . . . . . . . . . . . . . . 582 
A.6.2 folding from left . . . . . . . . . . . . . . . . . . . . . . . 584 
A.6.3 folding in practice . . . . . . . . . . . . . . . . . . . . . . 587 
A.7 Searching and matching . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.1 Existence testing . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.2 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.3 finding and filtering . . . . . . . . . . . . . . . . . . . . . 589 
A.7.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 
A.8 zipping and unzipping . . . . . . . . . . . . . . . . . . . . . . . . 594 
A.9 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 597 
GNU Free Documentation License 601 
1. APPLICABILITY AND DEFINITIONS . . . . . . . . . . . . . . . 601 
2. VERBATIM COPYING . . . . . . . . . . . . . . . . . . . . . . . . 603 
3. COPYING IN QUANTITY . . . . . . . . . . . . . . . . . . . . . . 603 
4. MODIFICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . 604 
5. COMBINING DOCUMENTS . . . . . . . . . . . . . . . . . . . . . 605 
6. COLLECTIONS OF DOCUMENTS . . . . . . . . . . . . . . . . . 606 
7. AGGREGATION WITH INDEPENDENT WORKS . . . . . . . . 606 
8. TRANSLATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606 
9. TERMINATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 
10. FUTURE REVISIONS OF THIS LICENSE . . . . . . . . . . . . 607
11. RELICENSING . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 
ADDENDUM: How to use this License for your documents . . . . . . 608
Part I 
Preface 
0.1 Why? 
`Are algorithms useful?'. Some programmers say that they seldom use any 
serious data structures or algorithms in real work such as commercial application 
development. Even when they need some of them, they have already been 
provided by libraries. For example, the C++ standard template library (STL) 
provides sort and selection algorithms as well as the vector, queue, and set data 
structures. It seems that knowing about how to use the library as a tool is quite 
enough. 
Instead of answering this question directly, I would like to say algorithms 
and data structures are critical in solving `interesting problems', the usefulness 
of the problem set aside. 
Let's start with two problems that look like they can be solved in a brute-force 
way even by a fresh programmer. 
0.2 The smallest free ID problem, the power of algorithms 
This problem is discussed in Chapter 1 of Richard Bird's book [1]. It's common 
that applications and systems use IDs (identifiers) to manage objects and entities. 
At any time, some IDs are used, and some of them are available for use. When 
some client tries to acquire a new ID, we want to always allocate it the smallest 
available one. Suppose IDs are non-negative integers and all IDs in use are kept 
in a list (or an array) which is not ordered. For example: 
[18, 4, 8, 9, 16, 1, 14, 7, 19, 3, 0, 5, 2, 11, 6] 
How can you find the smallest free ID, which is 10, from the list? 
It seems the solution is quite easy even without any serious algorithms. 
1: function Min-Free(A)
2:   x ← 0
3:   loop
4:     if x ∉ A then
5:       return x
6:     else
7:       x ← x + 1

where ∉ is realized like below.

1: function `∉'(x, X)
2:   for i ← 1 to |X| do
3:     if x = X[i] then
4:       return False
5:   return True
Some languages provide handy tools which wrap this linear time process. For 
example in Python, this algorithm can be directly translated as the following. 
def brute_force(lst):
    i = 0
    while True:
        if i not in lst:
            return i
        i = i + 1
It seems this problem is trivial. However, there will be millions of IDs in a 
large system. The speed of this solution is poor in such cases, for it takes O(n²) 
time, where n is the length of the ID list. On my computer (2 Cores 2.10 GHz, 
with 2G RAM), a C program using this solution takes an average of 5.4 seconds 
to search a minimum free number among 100,000 IDs¹. And it takes more than 
8 minutes to handle a million numbers. 
0.2.1 Improvement 1 
The key idea to improve the solution is based on the fact that for a series of n 
numbers x1, x2, ..., xn, if there are free numbers, some of the xi are outside the 
range [0, n); otherwise the list is exactly a permutation of 0, 1, ..., n − 1 and n 
should be returned as the minimum free number. It means that max(xi) ≤ n − 1. 
And we have the following fact.

minfree(x1, x2, ..., xn) ≤ n    (1)

One solution is to use an array of n + 1 flags to mark whether a number in 
the range [0, n] is free. 
1: function Min-Free(A)
2:   F ← [False, False, ..., False] where |F| = n + 1
3:   for ∀x ∈ A do
4:     if x ≤ n then
5:       F[x] ← True
6:   for i ← [0, n] do
7:     if F[i] = False then
8:       return i
Line 2 initializes a flag array of all False values. This takes O(n) time. Then 
the algorithm scans all numbers in A and sets the corresponding flag to True if 
the value is not greater than n. This step also takes O(n) time. Finally, the 
algorithm performs a linear time search to find the first flag with a False value. 
So the total performance of this algorithm is O(n). Note that we use n + 1 flags 
instead of n flags to cover the special case that sorted(A) = [0, 1, 2, ..., n − 1]. 

Although the algorithm only takes O(n) time, it needs extra O(n) space to 
store the flags. 
This solution is much faster than the brute force one. On my computer, 
the relevant Python program takes an average of 0.02 seconds when dealing with 
100,000 numbers. 
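The Python program itself is not listed in the original text; as a minimal sketch of the flag-array idea (the function name and details are mine, and the shipped program may differ), it could look like:

def min_free(lst):
    n = len(lst)
    flags = [False] * (n + 1)   # n + 1 flags cover the permutation case [0..n-1]
    for x in lst:
        if x <= n:
            flags[x] = True
    for i in range(n + 1):
        if not flags[i]:
            return i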
We haven't fine-tuned this algorithm yet. Observe that each time we have 
to allocate memory to create an array of n + 1 flags, and release the memory 
when finished. The memory allocation and release is very expensive and costs 
us a lot of processing time. 

There are two ways in which we can improve on this solution. One is to 
allocate the flags array in advance and reuse it for all the calls of our function to 
find the smallest free number. The other is to use bit-wise flags instead of a flag 
array. The following is the C program based on these two minor improvements. 
¹All programs can be downloaded along with this series of posts.
#define N 1000000 // 1 million
#define WORD_LENGTH (sizeof(int) * 8)

void setbit(unsigned int* bits, unsigned int i){
    bits[i / WORD_LENGTH] |= 1 << (i % WORD_LENGTH);
}

int testbit(unsigned int* bits, unsigned int i){
    return bits[i / WORD_LENGTH] & (1 << (i % WORD_LENGTH));
}

unsigned int bits[N / WORD_LENGTH + 1];

int min_free(int* xs, int n){
    int i, len = N / WORD_LENGTH + 1;
    for(i = 0; i < len; ++i)
        bits[i] = 0;
    for(i = 0; i < n; ++i)
        if(xs[i] < n)
            setbit(bits, xs[i]);
    for(i = 0; i <= n; ++i)
        if(!testbit(bits, i))
            return i;
}
This C program can handle 1,000,000 (1 million) IDs in just 0.023 seconds 
on my computer. 

The last for-loop can be further improved as seen below, but this is just minor 
fine-tuning. 
for(i = 0; ; ++i)
    if(~bits[i] != 0)
        for(j = 0; ; ++j)
            if(!testbit(bits, i * WORD_LENGTH + j))
                return i * WORD_LENGTH + j;
0.2.2 Improvement 2, Divide and Conquer 
Although the above improvement is much faster, it costs O(n) extra space to 
keep a check list. If n is a huge number, this means a huge amount of space is 
wasted. 

The typical divide and conquer strategy is to break the problem into some 
smaller ones, and solve these to get the final answer. 
We can put all numbers xi ≤ ⌊n/2⌋ into a sub-list A′ and put all the others 
into a second sub-list A″. Based on formula (1), if the length of A′ is exactly ⌊n/2⌋, 
this means the first half of the numbers are `full', which indicates that the minimum 
free number must be in A″, and so we'll need to recursively seek in the shorter 
list A″. Otherwise, it means the minimum free number is located in A′, which 
again leads to a smaller problem. 
When we search the minimum free number in A″, the condition changes 
a little bit: we are not searching for the smallest free number starting from 0, but 
actually from ⌊n/2⌋ + 1 as the lower bound. So the algorithm is something like 
minfree(A, l, u), where l is the lower bound and u is the upper bound index of 
the element. 

Note that there is a trivial case: if the number list is empty, we merely 
return the lower bound as the result. 
This divide and conquer solution can be formally expressed as a function:

minfree(A) = search(A, 0, |A| − 1)

search(A, l, u) = | l : A = ∅
                  | search(A″, m + 1, u) : |A′| = m − l + 1
                  | search(A′, l, m) : otherwise

where

m = ⌊(l + u)/2⌋
A′ = {∀x ∈ A ∧ x ≤ m}
A″ = {∀x ∈ A ∧ x > m}
It is obvious that this algorithm doesn't need any extra space². Each call 
performs O(|A|) comparisons to build A′ and A″. After that the problem scale 
halves. So the time needed for this algorithm is T(n) = T(n/2) + O(n), which 
reduces to O(n). Another way to analyze the performance is by observing that 
the first call takes O(n) to build A′ and A″, the second call takes O(n/2), 
O(n/4) for the third, and so on. The total time is O(n + n/2 + n/4 + ...) = O(2n) = O(n). 
In functional programming languages such as Haskell, partitioning a list has 
already been provided in the basic library and this algorithm can be translated 
as the following. 
import Data.List

minFree xs = bsearch xs 0 (length xs - 1)

bsearch xs l u | xs == [] = l
               | length as == m - l + 1 = bsearch bs (m + 1) u
               | otherwise = bsearch as l m
    where
      m = (l + u) `div` 2
      (as, bs) = partition (<= m) xs
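As a quick check, evaluating minFree [18, 4, 8, 9, 16, 1, 14, 7, 19, 3, 0, 5, 2, 11, 6] on the example list given earlier yields 10, as expected.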
0.2.3 Expressiveness vs. Performance 
Imperative language programmers may be concerned about the performance of 
this kind of implementation. For instance in this minimum free ID problem, the 
number of recursive calls is in O(lg n), which means the stack size consumed 
is in O(lg n). It's not free in terms of space. But if we want to avoid that, we 
can eliminate the recursion by replacing it with iteration³, which yields the 
following C program. 

²Procedural programmers may note that it actually takes O(lg n) stack space for book-keeping. As we'll see later, this can be eliminated either by tail recursion optimization, for instance gcc -O2, or by manually changing the recursion to iteration.
int min_free(int* xs, int n){
    int l = 0;
    int u = n - 1;
    while(n){
        int m = (l + u) / 2;
        int right, left = 0;
        for(right = 0; right < n; ++right)
            if(xs[right] <= m){
                swap(xs[left], xs[right]); /* swap is assumed to exchange two array elements */
                ++left;
            }
        if(left == m - l + 1){
            xs = xs + left;
            n = n - left;
            l = m + 1;
        }
        else{
            n = left;
            u = m;
        }
    }
    return l;
}
This program uses a `quick-sort' like approach to re-arrange the array so that 
all the elements before left are less than or equal to m, while those between 
left and right are greater than m. This is shown in figure 1. 

Figure 1: Divide the array: all x[i] ≤ m where 0 ≤ i < left; all x[i] > m 
where left ≤ i < right. The rest of the elements are unknown. 
This program is fast and it doesn't need extra stack space. However, compared 
to the previous Haskell program, it's hard to read and the expressiveness 
decreased. We have to balance performance and expressiveness. 

³This is done automatically in most functional languages since our function is in tail recursive form, which lends itself perfectly to this transformation.
0.3 The number puzzle, power of data structure 
If the first problem, to find the minimum free number, is somewhat useful in 
practice, this problem is a `pure' one for fun. The puzzle is to find the 1,500th 
number which contains only the factors 2, 3 or 5. The first 3 such numbers are of course 
2, 3, and 5. Number 60 = 2²3¹5¹, however, is the 25th number. Number 
21 = 2⁰3¹7¹ isn't a valid number because it contains a factor 7. The first 10 
such numbers are listed as the following. 

2, 3, 4, 5, 6, 8, 9, 10, 12, 15 

If we consider 1 = 2⁰3⁰5⁰, then 1 is also a valid number and it is the first 
one. 
0.3.1 The brute-force solution 
It seems the solution is quite easy without needing any serious algorithms. We can 
check all numbers from 1, then extract all factors of 2, 3 and 5 to see if the remaining 
part is 1. 
1: function Get-Number(n)
2:   x ← 1
3:   i ← 0
4:   loop
5:     if Valid?(x) then
6:       i ← i + 1
7:       if i = n then
8:         return x
9:     x ← x + 1

10: function Valid?(x)
11:   while x mod 2 = 0 do
12:     x ← x/2
13:   while x mod 3 = 0 do
14:     x ← x/3
15:   while x mod 5 = 0 do
16:     x ← x/5
17:   if x = 1 then
18:     return True
19:   else
20:     return False
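As a sketch of this brute-force approach (the measured C program is not listed here; this Python transliteration follows the pseudo code directly, and the function names are mine):

def valid(x):
    # divide out the factors 2, 3 and 5; x is valid iff nothing else remains
    for f in [2, 3, 5]:
        while x % f == 0:
            x = x // f
    return x == 1

def get_number(n):
    x = 1
    i = 0
    while True:
        if valid(x):
            i = i + 1
            if i == n:
                return x
        x = x + 1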
This `brute-force' algorithm works for most small n. However, to find the 
1500th number (which is 859963392), the C program based on this algorithm 
takes 40.39 seconds on my computer. I had to kill the program after 10 minutes 
when I increased n to 15,000. 
0.3.2 Improvement 1 
Analysis of the above algorithm shows that modulo and division calculations 
are very expensive [2], and they are executed a lot in loops. Instead of checking whether a 
number contains only 2, 3, or 5 as factors, one alternative solution is to construct 
such numbers from these factors. 
We start from 1, and multiply it by 2, 3, or 5 to generate the rest of the numbers. 
The problem turns out to be how to generate the candidate numbers in order. One 
handy way is to utilize the queue data structure. 

A queue data structure is used to push elements at one end, and pop them 
at the other end, so that the element pushed first is also popped first. 
This property is called FIFO (First-In-First-Out). 

The idea is to push 1 as the only element to the queue; then we pop an 
element, multiply it by 2, 3, and 5, to get 3 new elements. We then push them 
back to the queue in order. Note that a new element may already 
exist in the queue. In such a case, we just drop the element. A new element 
may also be smaller than the others in the queue, so we must put it in the 
correct position. Figure 2 illustrates this idea. 
Figure 2: First 4 steps of constructing numbers with a queue. 
1. Queue is initialized with 1 as the only element; 
2. New elements 2, 3, and 5 are pushed back; 
3. New elements 4, 6, and 10, are pushed back in order; 
4. New elements 9 and 15 are pushed back, element 6 already exists. 
This algorithm is shown as the following. 
1: function Get-Number(n)
2:   Q ← NIL
3:   Enqueue(Q, 1)
4:   while n > 0 do
5:     x ← Dequeue(Q)
6:     Unique-Enqueue(Q, 2x)
7:     Unique-Enqueue(Q, 3x)
8:     Unique-Enqueue(Q, 5x)
9:     n ← n − 1
10:  return x

11: function Unique-Enqueue(Q, x)
12:   i ← 0
13:   while i < |Q| ∧ Q[i] < x do
14:     i ← i + 1
15:   if i < |Q| ∧ x = Q[i] then
16:     return
17:   Insert(Q, i, x)
The insert function takes O(|Q|) time to find the proper position and insert 
the element. If the element has already existed, it just returns. 

A rough estimation tells that the length of the queue increases in proportion to 
n (each time we extract one element and push at most 3 new ones, so the increase ratio ≤ 
2), so the total running time is O(1 + 2 + 3 + ... + n) = O(n²). 

Figure 3 shows the number of queue accesses against n. It is a quadratic 
curve which reflects the O(n²) performance. 

Figure 3: Queue access count vs. n. 

The C program based on this algorithm takes only 0.016[s] to get the right 
answer 859963392, which is about 2500 times faster than the brute force solution. 
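A minimal Python sketch of this queue-based construction (using the standard bisect module to locate the ordered insertion point; the helper names are mine, not the book's):

from bisect import bisect_left

def unique_enqueue(q, x):
    # keep q sorted and free of duplicates; insertion is O(|q|) in the worst case
    i = bisect_left(q, x)
    if i == len(q) or q[i] != x:
        q.insert(i, x)

def get_number(n):
    q = [1]
    while n > 0:
        x = q.pop(0)
        for f in [2, 3, 5]:
            unique_enqueue(q, f * x)
        n = n - 1
    return x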
Improvement 1 can also be considered in a recursive way. Suppose X is the 
infinite series of all numbers which contain only factors of 2, 3, or 5. The 
following formula shows an interesting relationship. 

X = {1} ∪ {2x : ∀x ∈ X} ∪ {3x : ∀x ∈ X} ∪ {5x : ∀x ∈ X}    (2)

where we can define ∪ in a special form so that all elements are stored 
in order as well as unique to each other. Suppose X = {x1, x2, x3, ...}, 
Y = {y1, y2, y3, ...}, X′ = {x2, x3, ...} and Y′ = {y2, y3, ...}. We have 

X ∪ Y = | X : Y = ∅
        | Y : X = ∅
        | {x1, X′ ∪ Y} : x1 < y1
        | {x1, X′ ∪ Y′} : x1 = y1
        | {y1, X ∪ Y′} : x1 > y1
In a functional programming language such as Haskell, which supports lazy 
evaluation, the above infinite series can be translated into the following 
program. 
ns = 1 : merge (map (*2) ns) (merge (map (*3) ns) (map (*5) ns))

merge [] l = l
merge l [] = l
merge (x:xs) (y:ys) | x < y = x : merge xs (y:ys)
                    | x == y = x : merge xs ys
                    | otherwise = y : merge (x:xs) ys
By evaluating ns !! (n-1), we can get the 1500th number as below. 
ns !! (1500-1) 
859963392 
0.3.3 Improvement 2 
Considering the above solution: although it is much faster than the brute-force 
one, it still has some drawbacks. It produces many duplicated numbers, which 
are finally dropped when examining the queue. Secondly, it does linear scanning 
and insertion to keep the order of all elements in the queue, which degrades the 
ENQUEUE operation from O(1) to O(|Q|). 
If we use three queues instead of only one, we can improve the solution 
one step further. Denote these queues as Q2, Q3, and Q5, and initialize them 
as Q2 = {2}, Q3 = {3} and Q5 = {5}. Each time we DEQUEUE the smallest 
element x among the heads of Q2, Q3, and Q5, and do the following test: 

• If x comes from Q2, we ENQUEUE 2x, 3x, and 5x back to Q2, Q3, and 
Q5 respectively; 

• If x comes from Q3, we only need to ENQUEUE 3x to Q3 and 5x to Q5; 
we needn't ENQUEUE 2x to Q2, because 2x has already existed in Q3; 

• If x comes from Q5, we only need to ENQUEUE 5x to Q5; there is no need 
to ENQUEUE 2x, 3x to Q2, Q3, because they have already been in the 
queues. 

We repeatedly extract the smallest one until we find the n-th element. 
The algorithm based on this idea is implemented as below. 
1: function Get-Number(n)
2:   if n = 1 then
3:     return 1
4:   else
5:     Q2 ← {2}
6:     Q3 ← {3}
7:     Q5 ← {5}
8:     while n > 1 do
9:       x ← min(Head(Q2), Head(Q3), Head(Q5))
10:      if x = Head(Q2) then
11:        Dequeue(Q2)
12:        Enqueue(Q2, 2x)
13:        Enqueue(Q3, 3x)
14:        Enqueue(Q5, 5x)
15:      else if x = Head(Q3) then
16:        Dequeue(Q3)
17:        Enqueue(Q3, 3x)
18:        Enqueue(Q5, 5x)
19:      else
20:        Dequeue(Q5)
21:        Enqueue(Q5, 5x)
22:      n ← n − 1
23:    return x

Figure 4: First 4 steps of constructing numbers with Q2, Q3, and Q5. 
1. Queues are initialized with 2, 3, 5 as the only elements; 
2. New elements 4, 6, and 10 are pushed back; 
3. New elements 9 and 15 are pushed back; 
4. New elements 8, 12, and 20 are pushed back; 
5. New element 25 is pushed back. 
This algorithm loops n times. Within each loop, it extracts one head 
element from the three queues, which takes constant time. Then it appends 
one to three new elements at the end of the queues, which is bound to constant time 
too. So the total time of the algorithm is bound to O(n). The C++ program 
translated from this algorithm, shown below, takes less than 1 s to produce the 
1500th number, 859963392. 
#include <queue>
#include <algorithm> // for min
using namespace std;

typedef unsigned long Integer;

Integer get_number(int n){
    if(n == 1)
        return 1;
    queue<Integer> Q2, Q3, Q5;
    Q2.push(2);
    Q3.push(3);
    Q5.push(5);
    Integer x;
    while(n-- > 1){
        x = min(min(Q2.front(), Q3.front()), Q5.front());
        if(x == Q2.front()){
            Q2.pop();
            Q2.push(x * 2);
            Q3.push(x * 3);
            Q5.push(x * 5);
        }
        else if(x == Q3.front()){
            Q3.pop();
            Q3.push(x * 3);
            Q5.push(x * 5);
        }
        else{
            Q5.pop();
            Q5.push(x * 5);
        }
    }
    return x;
}
This solution can also be implemented in a functional way. We define a function 
take(n), which returns the first n numbers containing only the factors 2, 3, or 
5. 
take(n) = f(n, {1}, {2}, {3}, {5})

where

f(n, X, Q2, Q3, Q5) = | X : n = 1
                      | f(n − 1, X ∪ {x}, Q2′, Q3′, Q5′) : otherwise

x = min(Q2₁, Q3₁, Q5₁)

(Q2′, Q3′, Q5′) = | ({Q2₂, Q2₃, ...} ∪ {2x}, Q3 ∪ {3x}, Q5 ∪ {5x}) : x = Q2₁
                  | (Q2, {Q3₂, Q3₃, ...} ∪ {3x}, Q5 ∪ {5x}) : x = Q3₁
                  | (Q2, Q3, {Q5₂, Q5₃, ...} ∪ {5x}) : x = Q5₁

Here Q2₁ denotes the head of queue Q2, and {Q2₂, Q2₃, ...} denotes its tail.
And these functional definitions can be realized in Haskell as the following. 
ks 1 xs _ = xs
ks n xs (q2, q3, q5) = ks (n - 1) (xs ++ [x]) update
    where
      x = minimum $ map head [q2, q3, q5]
      update | x == head q2 = ((tail q2) ++ [x*2], q3 ++ [x*3], q5 ++ [x*5])
             | x == head q3 = (q2, (tail q3) ++ [x*3], q5 ++ [x*5])
             | otherwise = (q2, q3, (tail q5) ++ [x*5])

takeN n = ks n [1] ([2], [3], [5])
Invoking `last (takeN 1500)' generates the correct answer 859963392. 
0.4 Notes and short summary 
Reviewing the two puzzles, we find that in both cases the brute-force solutions are 
quite weak. In the first problem, it's quite poor at dealing with long ID lists, while in 
the second problem, it doesn't work at all. 

The first problem shows the power of algorithms, while the second problem 
tells why data structures are important. There are plenty of interesting problems 
which were hard to solve before computers were invented. With the aid of computers 
and programming, we are able to find the answers in quite a different way. 
Compared to what we learned in mathematics courses in school, we haven't been 
taught methods like these. 
While there are already a lot of wonderful books about algorithms, 
data structures and math, few of them provide a comparison between 
the procedural solution and the functional solution. From the above discussion, 
it can be found that functional solutions are sometimes very expressive, and they 
are close to what we are familiar with in mathematics. 

This series of posts focuses on providing both imperative and functional algorithms 
and data structures. Many functional data structures can be referenced 
from Okasaki's book [6], while the imperative ones can be found in classic 
textbooks [2] or even on Wikipedia. Multiple programming languages, including 
C, C++, Python, Haskell, and Scheme/Lisp, will be used. In order to make 
it easy to read for programmers with different backgrounds, pseudo code and 
mathematical functions are the regular descriptions in each post. 

The author is NOT a native English speaker; the reason why this book is 
only available in English for the time being is that the contents are still 
changing frequently. Any feedback, comments, or criticism is welcome. 
0.5 Structure of the contents 
In the following series of posts, I'll first introduce the elementary data structures 
before algorithms, because many algorithms need knowledge of data structures 
as a prerequisite. 
The `hello world' data structure, the binary search tree, is the first topic. Then 
we introduce how to solve the balance problem of binary search trees. After 
that, I'll show other interesting trees: Trie, Patricia and suffix trees are useful in 
text manipulation, while B-trees are commonly used in file system and database 
implementation. 

The second part of the data structures is about heaps. We'll provide a general 
heap definition and introduce binary heaps by array and by explicit 
binary trees. Then we'll extend to K-ary heaps including Binomial heaps, Fibonacci 
heaps, and pairing heaps. 

Arrays and queues are typically considered among the easiest data structures; 
however, we'll show how difficult it is to implement them in the third part. 

As the elementary sorting algorithms, we'll introduce insertion sort, quick sort, 
merge sort etc. in both imperative and functional ways. 

The final part is about searching; besides element searching, we'll also 
show string matching algorithms such as KMP. 

All the posts are provided under the GNU FDL (Free Documentation License), and 
programs are under the GNU GPL. 
0.6 Appendix 
All programs provided along with this book are free for downloading. Download 
position: http://sites.google.com/site/algoxy/home
Bibliography 
[1] Richard Bird. Pearls of Functional Algorithm Design. Cambridge University Press; 1st edition (November 1, 2010). ISBN-10: 0521513383 

[2] Jon Bentley. Programming Pearls (2nd Edition). Addison-Wesley Professional; 2nd edition (October 7, 1999). ISBN-13: 978-0201657883 

[3] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press (July 1, 1999). ISBN-13: 978-0521663502 

[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937. 
Part II 
Trees 
Chapter 1 
Binary search tree, the `hello world' data structure 
1.1 Introduction 
Arrays or lists are typically considered the `hello world' data structures. However, 
we'll see they are not actually particularly easy to implement. In some 
procedural settings, arrays are the most elementary data structures, and it is 
possible to implement linked lists using arrays (see section 10.3 in [2]). On the 
other hand, in some functional settings, linked lists are the elementary building 
blocks used to create arrays and other data structures. 
Considering these factors, we start with Binary Search Trees (or BST) as the 
`hello world' data structure using an interesting problem Jon Bentley mentioned 
in `Programming Pearls' [2]. The problem is to count the number of times each 
word occurs in a large text. One solution in C++ is below: 
#include <iostream>
#include <map>
#include <string>
using namespace std;

int main(int, char**){
    map<string, int> dict;
    string s;
    while(cin >> s)
        ++dict[s];
    map<string, int>::iterator it = dict.begin();
    for(; it != dict.end(); ++it)
        cout << it->first << ": " << it->second << "\n";
}
And we can run it to produce the result using the following UNIX commands¹. 

$ g++ wordcount.cpp -o wordcount 
$ cat bbe.txt | ./wordcount > wc.txt 
The map provided in the standard template library is a kind of balanced 
BST with augmented data. Here we use the words in the text as the keys and 
the number of occurrences as the augmented data. This program is fast, and 
it reflects the power of BSTs. We'll introduce how to implement BSTs in this 
section and show how to balance them in a later section. 

¹This is not a UNIX-only command; in Windows OS, it can be achieved by: 
type bbe.txt | wordcount.exe > wc.txt 
Before we dive into BSTs, let's first introduce the more general binary tree. 
Binary trees are recursively defined. BSTs are just one type of binary tree. 
A binary tree is usually defined in the following way. 
A binary tree is 

• either an empty node; 

• or a node containing 3 parts: a value, a left child which is a binary tree 
and a right child which is also a binary tree. 
Figure 1.1 shows this concept and an example binary tree. 

Figure 1.1: Binary tree concept and an example. (a) Concept of binary tree; (b) an example binary tree. 
A BST is a binary tree where the following applies to each node: 

• all the values in the left child tree are less than the value of this node; 

• the value of this node is less than any values in its right child tree. 

Figure 1.2 shows an example of a binary search tree. Comparing it with Figure 
1.1, we can see the differences in how keys are ordered between them. 
Figure 1.2: A binary search tree example. 
1.2 Data Layout 
Based on the recursive definition of BSTs, we can draw the data layout in a 
procedural setting with pointers as in Figure 1.3. 

The node contains a field of key, which can be augmented with satellite data, 
a field containing a pointer to the left child, and a field pointing to the right child. In 
order to back-track to an ancestor easily, a parent field can be provided as well. 

In this post, we'll ignore the satellite data for simple illustration purposes. 
Based on this layout, the node of a binary search tree can be defined in a procedural 
language, such as C++, as the following. 
template<class T>
struct node{
    node(T x) : key(x), left(0), right(0), parent(0){}
    ~node(){
        delete left;
        delete right;
    }

    node* left;
    node* right;
    node* parent; // parent is optional, it's helpful for succ/pred
    T key;
};
There is another setting, for instance in Scheme/Lisp languages, where the elementary 
data structure is the linked list. Figure 1.4 shows how a binary search tree 
node can be built on top of linked lists. 

Figure 1.3: Layout of nodes with parent field. 

Figure 1.4: Binary search tree node layout on top of linked list, where `left ...' 
and `right ...' are either empty or binary search tree nodes composed in the same 
way. 
Because in a pure functional setting it's hard to use pointers for back-tracking 
the ancestors (and typically, there is no need to do back-tracking, since we can 
provide a top-down solution recursively), there is no `parent' field in such a 
layout. 

For simplicity, we'll skip the detailed layout in the future and only 
focus on the logical layout of data structures. For example, below is the definition 
of a binary search tree node in Haskell. 

data Tree a = Empty 
            | Node (Tree a) a (Tree a) 
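For comparison, a sketch of the same node in Python (this mirrors the C++ layout above; the Python program shipped with the book may differ, so treat the class name as hypothetical):

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None  # optional, helpful for succ/pred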
1.3 Insertion 
To insert a key k (possibly along with a value in practice) to a binary search tree 
T, we can follow a quite straightforward way. 

• If the tree is empty, then construct a leaf node with key = k; 

• If k is less than the key of the root node, insert it to the left child; 

• If k is greater than the key of the root, insert it to the right child. 

There is an exceptional case: if k is equal to the key of the root, it means the 
key already exists; we can either overwrite the data, or just do nothing. For 
simplicity, this case is skipped in this post. 

This algorithm is described recursively. It is so simple that this is why we consider 
the binary search tree a `hello world' data structure. Formally, the algorithm can 
be represented with a recursive function. 
insert(T, k) = | node(∅, k, ∅) : T = ∅
               | node(insert(L, k), Key, R) : k < Key
               | node(L, Key, insert(R, k)) : otherwise    (1.1)

where 

L = left(T) 
R = right(T) 
Key = key(T) 

The node function creates a new node with the given left sub-tree, a key and a 
right sub-tree as parameters. ∅ means NIL or Empty. Functions left, right and 
key are access functions which can get the left sub-tree, right sub-tree and the 
key of a node. 
Translating the above function directly to Haskell yields the following program. 
insert :: (Ord a) => Tree a -> a -> Tree a
insert Empty k = Node Empty k Empty
insert (Node l x r) k | k < x = Node (insert l k) x r
                      | otherwise = Node l x (insert r k)
This program utilizes the pattern matching features provided by the language. 
However, even in functional settings without this feature, for instance 
Scheme/Lisp, the program is still expressive. 
(define (insert tree x)
  (cond ((null? tree) (list '() x '()))
        ((< x (key tree))
         (make-tree (insert (left tree) x)
                    (key tree)
                    (right tree)))
        ((> x (key tree))
         (make-tree (left tree)
                    (key tree)
                    (insert (right tree) x)))))
It is possible to turn the algorithm completely into an imperative way without 
recursion. 
1: function Insert(T, k)
2:   root ← T
3:   x ← Create-Leaf(k)
4:   parent ← NIL
5:   while T ≠ NIL do
6:     parent ← T
7:     if k < Key(T) then
8:       T ← Left(T)
9:     else
10:      T ← Right(T)
11:  Parent(x) ← parent
12:  if parent = NIL then    ▷ tree T is empty
13:    return x
14:  else if k < Key(parent) then
15:    Left(parent) ← x
16:  else
17:    Right(parent) ← x
18:  return root

19: function Create-Leaf(k)
20:   x ← Empty-Node
21:   Key(x) ← k
22:   Left(x) ← NIL
23:   Right(x) ← NIL
24:   Parent(x) ← NIL
25:   return x
Compared with the functional algorithm, it is obvious that this one is more 
complex, although it is fast and can handle very deep trees. A complete C++ 
program and a Python program are available along with this post for reference. 
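The referenced Python program is not listed in this section; a sketch of the iterative insertion, following the pseudo code above and the Node class sketched in section 1.2, could be:

def insert(t, key):
    root = t
    x = Node(key)
    parent = None
    while t is not None:      # descend to the insertion point
        parent = t
        t = t.left if key < t.key else t.right
    x.parent = parent
    if parent is None:        # tree was empty
        return x
    if key < parent.key:
        parent.left = x
    else:
        parent.right = x
    return root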
1.4 Traversing 
Traversing means visiting every element one by one in a binary search tree. 
There are 3 ways to traverse a binary tree: pre-order tree walk, in-order tree 
walk, and post-order tree walk. The names of these traversing methods highlight 
the order in which we visit the root of a binary search tree. 
A tree has three parts: the left child, the root, which contains 
the key and satellite data, and the right child. If we denote them as 
(left, current, right), the three traversing methods are defined as the following. 

• pre-order traverse: visit current, then left, finally right; 

• in-order traverse: visit left, then current, finally right; 

• post-order traverse: visit left, then right, finally current. 

Note that each visiting operation is recursive. And we see the order of 
visiting current determines the name of the traversing method. 
For the binary search tree shown in figure 1.2, below are the three different 
traversing results. 

• pre-order traverse result: 4, 3, 1, 2, 8, 7, 16, 10, 9, 14; 

• in-order traverse result: 1, 2, 3, 4, 7, 8, 9, 10, 14, 16; 

• post-order traverse result: 2, 1, 3, 7, 9, 14, 10, 16, 8, 4. 
It can be found that the in-order walk of a binary search tree outputs the 
elements in increasing order, which is particularly helpful. The definition of the binary 
search tree ensures this interesting property, while the proof of this fact is left 
as an exercise of this post. 
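Before focusing on the in-order walk, here is a compact sketch of all three orders in Python (over the Node class sketched earlier; f is applied to every key):

def pre_order(t, f):
    if t is not None:
        f(t.key)
        pre_order(t.left, f)
        pre_order(t.right, f)

def in_order(t, f):
    if t is not None:
        in_order(t.left, f)
        f(t.key)
        in_order(t.right, f)

def post_order(t, f):
    if t is not None:
        post_order(t.left, f)
        post_order(t.right, f)
        f(t.key)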
The in-order tree walk algorithm can be described as the following: 

• If the tree is empty, just return; 

• traverse the left child by in-order walk, then access the key, finally traverse 
the right child by in-order walk. 
Translating the above description yields a generic map function.

map(f, T) = | ∅ : T = ∅
            | node(l′, k′, r′) : otherwise    (1.2)

where 

l′ = map(f, left(T)) 
r′ = map(f, right(T)) 
k′ = f(key(T)) 
If we only need to access the key without creating the transformed tree, we can 
realize this algorithm in a procedural way like the C++ program below. 
template<class T, class F>
void in_order_walk(node<T>* t, F f){
    if(t){
        in_order_walk(t->left, f);
        f(t->value);
        in_order_walk(t->right, f);
    }
}
The function takes a parameter f, which can be a real function or a function 
object; the program will apply f to the node by in-order tree walk. 

We can simplify this algorithm one more step to define a function which 
turns a binary search tree into a sorted list by in-order traversing. 
toList(T) = | ∅ : T = ∅
            | toList(left(T)) ∪ {key(T)} ∪ toList(right(T)) : otherwise    (1.3)
Below is the Haskell program based on this definition. 
toList :: (Ord a) => Tree a -> [a]
toList Empty = []
toList (Node l x r) = toList l ++ [x] ++ toList r
This provides us a method to sort a list of elements: we can first build 
a binary search tree from the list, then output the tree by in-order traversing. 
This method is called `tree sort'. Let's denote the list X = {x1, x2, x3, ..., xn}. 

sort(X) = toList(fromList(X))    (1.4)

And we can write it in function composition form.

sort = toList . fromList

where function fromList repeatedly inserts every element into a binary search 
tree.

fromList(X) = foldL(insert, ∅, X)    (1.5)

It can also be written in partial application form like below.

fromList = foldL insert ∅

For readers who are not familiar with folding from left, this function can 
also be defined recursively as the following.

fromList(X) = | ∅ : X = ∅
              | insert(fromList({x2, x3, ..., xn}), x1) : otherwise

We'll make intensive use of folding functions as well as function composition and 
partial evaluation in the future; please refer to the appendix of this book or [6], [7] 
and [8] for more information. 
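A sketch of tree sort in Python, folding the iterative insert sketched in section 1.3 over the input and then flattening the tree in order (function names are mine):

from functools import reduce

def from_list(xs):
    # fold insert over xs, starting from the empty tree (None)
    return reduce(insert, xs, None)

def to_list(t):
    if t is None:
        return []
    return to_list(t.left) + [t.key] + to_list(t.right)

def tree_sort(xs):
    return to_list(from_list(xs))

For example, tree_sort([4, 3, 8, 1, 7, 16, 2, 10, 9, 14]) gives [1, 2, 3, 4, 7, 8, 9, 10, 14, 16].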
Exercise 1.1 

• Given the in-order traverse result and pre-order traverse result, can you reconstruct 
the tree from these results and figure out the post-order traversing 
result? 
Pre-order result: 1, 2, 4, 3, 5, 6; In-order result: 4, 2, 1, 5, 3, 6; Post-order 
result: ? 

• Write a program in your favorite language to re-construct the binary tree 
from the pre-order result and in-order result. 
• Prove why the in-order walk outputs the elements stored in a binary search 
tree in increasing order. 

• Can you analyze the performance of tree sort with big-O notation? 
1.5 Querying a binary search tree 
There are three types of querying for a binary search tree: searching for a key in the 
tree, finding the minimum or maximum element in the tree, and finding the predecessor 
or successor of an element in the tree. 
1.5.1 Looking up 
According to the definition of the binary search tree, searching for a key in a tree can be 
realized as the following. 

• If the tree is empty, the search fails; 

• If the key of the root is equal to the value to be found, the search succeeds. 
The root is returned as the result; 

• If the value is less than the key of the root, search in the left child; 

• Else, which means that the value is greater than the key of the root, search 
in the right child. 
This algorithm can be described with a recursive function as below.

lookup(T, x) = | ∅ : T = ∅
               | T : key(T) = x
               | lookup(left(T), x) : x < key(T)
               | lookup(right(T), x) : otherwise    (1.6)
In real applications, we may return the satellite data instead of the node 
as the search result. This algorithm is simple and straightforward. Here is a 
translation into a Haskell program. 
lookup :: (Ord a) => Tree a -> a -> Tree a
lookup Empty _ = Empty
lookup t@(Node l k r) x | k == x = t
                        | x < k = lookup l x
                        | otherwise = lookup r x
If the binary search tree is well balanced, which means that almost all nodes 
have both non-NIL left and right children, then for N elements the search 
algorithm takes O(lg N) time. This is not a formal definition of balance; 
we'll show it in a later post about red-black trees. If the tree is poorly balanced, 
the worst case takes O(N) time to search for a key. If we denote the height of 
the tree as h, we can express the performance of the algorithm uniformly as O(h). 
The search algorithm can also be realized without using recursion, in a procedural 
manner. 

1: function Search(T, x)
2:   while T ≠ NIL ∧ Key(T) ≠ x do
3:     if x < Key(T) then
4:       T ← Left(T)
5:     else
6:       T ← Right(T)
7:   return T
Below is the C++ program based on this algorithm. 
template<class T>
node<T>* search(node<T>* t, T x){
    while(t && t->key != x){
        if(x < t->key) t = t->left;
        else t = t->right;
    }
    return t;
}
1.5.2 Minimum and maximum 
Minimum and maximum can be implemented from the property of the binary search 
tree: smaller keys are always in the left child, and greater keys are in the right. 

For the minimum, we keep traversing the left sub-tree until it is empty, 
while for the maximum we traverse the right.

min(T) = | key(T) : left(T) = ∅
         | min(left(T)) : otherwise    (1.7)

max(T) = | key(T) : right(T) = ∅
         | max(right(T)) : otherwise    (1.8)
Both functions are bound to O(h) time, where h is the height of the tree. For a 
balanced binary search tree, min/max are bound to O(lg N) time, while they 
are O(N) in the worst cases. 

We skip translating them to programs here; it's also possible to implement them 
in a pure procedural way without using recursion. 
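Although the translation is skipped in the text, a direct iterative rendering in Python might be as follows. It returns the node itself rather than the bare key, which matches how tree_min/tree_max are used by the succ/pred programs later in this section (this sketch is mine, not the book's code):

def tree_min(t):
    while t.left is not None:
        t = t.left
    return t

def tree_max(t):
    while t.right is not None:
        t = t.right
    return t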
1.5.3 Successor and predecessor 
The last kind of querying, finding the successor or predecessor of an element, is useful when a tree is treated as a generic container and traversed with an iterator. It is relatively easier to implement if the parent of a node can be accessed directly.
It seems hard to find a functional solution, because there is no pointer-like field linking to the parent node. One solution is to leave 'breadcrumbs' when we visit the tree, and use this information to back-track or even re-construct the whole tree. Such a data structure, which contains both the tree and the 'breadcrumbs', is called a zipper; please refer to [9] for details.
However, if we consider the original purpose of providing succ/pred functions, 'to traverse all the binary search tree elements one by one' as a generic container, we realize that they don't make significant sense in functional settings, because we can traverse the tree in increasing order by the mapT function we defined previously.
We'll meet many problems in this series that are only valid in imperative settings, and that are not meaningful problems in functional settings at all. One good example is how to delete an element from a red-black tree[3]. In this section, we'll only present the imperative algorithm for finding the successor and predecessor in a binary search tree.
When finding the successor of element x, which is the smallest y that satisfies y > x, there are two cases. If the node with value x has a non-NIL right child, the minimum element in the right child is the answer. For example, in Figure 1.2, in order to find the successor of 8, we search its right sub tree for the minimum one, which yields 9 as the result. If node x doesn't have a right child, we need to back-track to find the closest ancestor whose left child is also an ancestor of x. In Figure 1.2, since 2 doesn't have a right sub tree, we go back to its parent 1. Node 2 is the right child of 1, so we go back again and reach node 3. The left child of 3 is also an ancestor of 2; thus, 3 is the successor of node 2.
Based on this description, the algorithm can be given as the following. 
1: function Succ(x)
2:     if Right(x) ≠ NIL then
3:         return Min(Right(x))
4:     else
5:         p ← Parent(x)
6:         while p ≠ NIL and x = Right(p) do
7:             x ← p
8:             p ← Parent(p)
9:         return p
The predecessor case is quite similar to the successor algorithm; they are symmetrical to each other.
1: function Pred(x)
2:     if Left(x) ≠ NIL then
3:         return Max(Left(x))
4:     else
5:         p ← Parent(x)
6:         while p ≠ NIL and x = Left(p) do
7:             x ← p
8:             p ← Parent(p)
9:         return p
Below are the Python programs based on these algorithms. They are changed a bit in the while loop conditions.
def succ(x): 
if x.right is not None: return tree_min(x.right) 
p = x.parent 
while p is not None and p.left != x: 
x = p 
p = p.parent 
return p 
def pred(x): 
if x.left is not None: return tree_max(x.left) 
p = x.parent
while p is not None and p.right != x: 
x = p 
p = p.parent 
return p 
Exercise 1.2 
• Can you figure out how to iterate a tree as a generic container by using pred()/succ()? What's the performance of such a traversal in terms of big-O?
• A reader discussed traversing all elements inside a range [a, b]. In C++, the algorithm looks like the code below:

for_each(m.lower_bound(12), m.upper_bound(26), f);

Can you provide a purely functional solution for this problem?
1.6 Deletion 
Deletion is another 'imperative only' topic for binary search tree. This is because deletion mutates the tree, while in purely functional settings, we don't modify the tree after building it in most applications. However, one method of deleting an element from a binary search tree in a purely functional way is shown in this section. It actually reconstructs the tree rather than modifying it.
Deletion is the most complex operation for binary search tree. This is because we must keep the BST property: for any node, all keys in its left sub tree are less than the key of this node, which in turn is less than any key in its right sub tree. Deleting a node can break this property. In this chapter, different from the algorithm described in [2], a simpler one from the SGI STL implementation is used[6].
To delete a node x from a tree:
• If x has no child or only one child, splice x out;
• Otherwise (x has two children), use the minimum element of its right sub tree to replace x, and splice the original minimum element out.
The simplicity comes from the fact that the minimum element is stored in a node of the right sub tree which can't have two non-NIL children, so it ends up in the trivial case: that node can be directly spliced out from the tree. Figures 1.5, 1.6, and 1.7 illustrate the different cases when deleting a node from the tree. Based on this idea, the deletion can be defined as the below function.
delete(T, x) =
    φ                         : T = φ
    node(delete(L, x), K, R)  : x < K
    node(L, K, delete(R, x))  : x > K
    R                         : x = K ∧ L = φ
    L                         : x = K ∧ R = φ
    node(L, y, delete(R, y))  : otherwise
(1.9)
Figure 1.5: x can be spliced out.
Figure 1.6: Delete a node which has only one non-NIL child: x is spliced out, and replaced by its left child (or symmetrically, by its right child).
Figure 1.7: Delete a node which has both children: x is replaced by splicing the minimum element out of its right child.
Where 
L = left(T) 
R = right(T) 
K = key(T) 
y = min(R) 
Translating the function to Haskell yields the below program. 
delete :: (Ord a) => Tree a -> a -> Tree a
delete Empty _ = Empty
delete (Node l k r) x | x < k = Node (delete l x) k r
                      | x > k = Node l k (delete r x)
                      -- x == k
                      | isEmpty l = r
                      | isEmpty r = l
                      | otherwise = Node l k' (delete r k')
    where k' = min r
Function isEmpty is used to test whether a tree is empty (φ). Note that the algorithm first performs a search to locate the node where the element needs to be deleted, and after that it executes the deletion. This algorithm takes O(h) time, where h is the height of the tree.

It's also possible to pass the node, rather than the element, to the algorithm for deletion; then the searching is no longer needed.
The imperative algorithm is more complex because it needs to set the parent pointers properly. The function returns the root of the resulting tree.
1: function Delete(T, x)
2:     root ← T
3:     x' ← x                          ▷ save x
4:     parent ← Parent(x)
5:     if Left(x) = NIL then
6:         x ← Right(x)
7:     else if Right(x) = NIL then
8:         x ← Left(x)
9:     else                            ▷ both children are non-NIL
10:        y ← Min(Right(x))
11:        Key(x) ← Key(y)
12:        copy other satellite data from y to x
13:        if Parent(y) ≠ x then       ▷ y has no left sub tree
14:            Left(Parent(y)) ← Right(y)
15:        else                        ▷ y is the root of x's right sub tree
16:            Right(x) ← Right(y)
17:        Remove y
18:        return root
19:    if x ≠ NIL then
20:        Parent(x) ← parent
21:    if parent = NIL then            ▷ we are removing the root of the tree
22:        root ← x
23:    else
24:        if Left(parent) = x' then
25:            Left(parent) ← x
26:        else
27:            Right(parent) ← x
28:    Remove x'
29:    return root
Here we assume the node to be deleted is not empty (otherwise we simply return the original tree). The algorithm first records the root of the tree and keeps copies of the pointers to x and its parent.

If either of the children is empty, the algorithm just splices x out. If x has two non-NIL children, we first locate the minimum node y of its right child, replace the key of x with y's, copy the satellite data as well, then splice y out. Note that there is a special case where y is the root node of x's right sub tree.

Finally, we need to reset the stored parent if the original x had at most one non-NIL child. If the parent pointer we copied before is empty, it means that we are deleting the root node, so we need to return the new root. After the parent is set properly, we finally remove the old x from memory.
The corresponding Python program for the deletion algorithm is given below. Because Python provides GC, we needn't explicitly remove the node from memory.
def tree_delete(t, x): 
if x is None: 
return t 
[root, old_x, parent] = [t, x, x.parent] 
if x.left is None: 
x = x.right 
elif x.right is None: 
x = x.left
else: 
y = tree_min(x.right) 
x.key = y.key 
if y.parent != x: 
y.parent.left = y.right 
else: 
x.right = y.right 
return root 
if x is not None: 
x.parent = parent 
if parent is None: 
root = x 
else: 
if parent.left == old_x: 
parent.left = x 
else: 
parent.right = x 
return root 
Because the procedure seeks minimum element, it runs in O(h) time on a 
tree of height h. 
Exercise 1.3 
• There is a symmetrical solution for deleting a node which has two non-NIL children: replace the element by splicing the maximum one out of the left sub-tree. Write a program to implement this solution.
1.7 Randomly build binary search tree 
All the operations given in this chapter are bound to O(h) time for a tree of height h. The height affects the performance a lot. For a very unbalanced tree, h tends to be O(N), which leads to the worst case, while for a balanced tree, h is close to O(lg N), and we gain good performance.

How to make the binary search tree balanced will be discussed in the next chapter. However, there exists a simple way: the binary search tree can be randomly built, as described in [2]. Random building can help to avoid (or at least decrease the possibility of) unbalanced binary trees. The idea is that, before building the tree, we call a random process to shuffle the elements.
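As a small illustration (an addition, not from the original text; it assumes the tree insertion function from earlier in this chapter, here called tree_insert), the shuffling can be done like this in Python:

import random

def random_build(xs):
    # shuffle a copy of the input, then build the tree as usual
    xs = list(xs)
    random.shuffle(xs)
    t = None
    for x in xs:
        t = tree_insert(t, x)
    return t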
Exercise 1.4 
• Write a randomly building process for binary search tree.
1.8 Appendix 
All programs are provided along with this post. They are free for downloading. 
We provided C, C++, Python, Haskell, and Scheme/Lisp programs as example.
Bibliography 
[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. ISBN: 0262032937. The MIT Press. 2001

[2] Jon Bentley. Programming Pearls (2nd Edition). Addison-Wesley Professional; 2nd edition (October 7, 1999). ISBN-13: 978-0201657883

[3] Chris Okasaki. Ten Years of Purely Functional Data Structures. http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data.html

[4] SGI. Standard Template Library Programmer's Guide. http://www.sgi.com/tech/stl/

[5] http://en.literateprograms.org/Category:Binary search tree

[6] http://en.wikipedia.org/wiki/Foldl

[7] http://en.wikipedia.org/wiki/Function composition

[8] http://en.wikipedia.org/wiki/Partial application

[9] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's Guide. The last chapter. No Starch Press; 1st edition, April 2011, 400 pp. ISBN: 978-1-59327-283-8
Chapter 2 
The evolution of insertion 
sort 
2.1 Introduction 
In the previous chapter, we introduced the 'hello world' data structure, binary search tree. In this chapter, we explain insertion sort, which can be thought of as the 'hello world' sorting algorithm 1. It's straightforward, but the performance is not as good as some divide and conquer sorting approaches, such as quick sort and merge sort. Thus insertion sort is seldom used as the generic sorting utility in modern software libraries. We'll analyze why it is slow, and try to improve it bit by bit until we reach the best bound of comparison-based sorting algorithms, O(n lg n), by evolving to tree sort. Finally, we show the connection between the 'hello world' data structure and the 'hello world' sorting algorithm.
The idea of insertion sort can be vividly illustrated by a real-life poker game[2]. Suppose the cards are shuffled, and a player starts taking cards one by one. At any time, all cards in the player's hand are well sorted. When the player gets a new card, he inserts it in the proper position according to the order of points. Figure 2.1 shows this insertion example.
Based on this idea, the algorithm of insertion sort can be directly given as 
the following. 
function Sort(A)
    X ← φ
    for each x ∈ A do
        Insert(X, x)
    return X
It's easy to express this process with folding, which we mentioned in the chapter about binary search tree.

sort = foldL insert φ    (2.1)
1Some readers may argue that bubble sort is the easiest sorting algorithm. Bubble sort isn't covered in this book, as we don't think it's a valuable algorithm[1].
Figure 2.1: Insert card 8 to proper position in a deck. 
Note that in the above algorithm, we store the sorted result in X, so this isn't in-place sorting. It's easy to change it into an in-place algorithm.
function Sort(A)
    for i ← 2 to Length(A) do
        insert A_i into the sorted sequence {A'_1, A'_2, ..., A'_(i-1)}
At any time, when we process the i-th element, all elements before i have already been sorted. We continuously insert the current element until all the unsorted data is consumed. This idea is illustrated in figure 2.2.
Figure 2.2: The left part is sorted data; continuously insert elements into the sorted part.
We can find a recursive concept in this definition. Thus it can be expressed as the following.
sort(A) =
    φ                                   : A = φ
    insert(sort({A_2, A_3, ...}), A_1)  : otherwise
(2.2)
2.2 Insertion 
We haven't answered the question of how to realize insertion, however. It's a puzzle how a human locates the proper position so quickly. For a computer, the obvious option is to perform a scan. We can either scan from left to right or vice versa. However, if the sequence is stored in a plain array, it's necessary to scan from right to left.
function Sort(A)
    for i ← 2 to Length(A) do    ▷ insert A[i] into the sorted sequence A[1...i-1]
        x ← A[i]
        j ← i - 1
        while j > 0 ∧ x < A[j] do
            A[j + 1] ← A[j]
            j ← j - 1
        A[j + 1] ← x
One may think scanning from left to right is natural. However, it isn't as effective as the above algorithm for a plain array. The reason is that it's expensive to insert an element at an arbitrary position in an array, as an array stores elements continuously. If we want to insert a new element x at position i, we must shift all elements after i, including i+1, i+2, ..., one position to the right. After that the cell at position i is empty, and we can put x in it. This is illustrated in figure 2.3.
Figure 2.3: Insert x into array A at position i.
If the length of the array is n, this indicates we need to examine the first i elements, then perform n - i + 1 moves, and then insert x into the i-th cell; so insertion from left to right needs to traverse the whole array anyway. If we scan from right to left instead, we examine only the last j = n - i + 1 elements, and perform the same amount of moves. If j is small (e.g. less than n/2), there is the possibility of performing fewer operations than with the left-to-right scan. Translating the above algorithm to Python yields the following code.
def isort(xs): 
n = len(xs) 
for i in range(1, n): 
x = xs[i] 
j = i - 1 
while j >= 0 and x < xs[j]:
xs[j+1] = xs[j] 
j = j - 1 
xs[j+1] = x 
Some other equivalent programs can be found, for instance the following ANSI C program. However, this version isn't as effective as the pseudo code.
void isort(Key* xs, int n) {
    int i, j;
    for (i = 1; i < n; ++i)
        for (j = i - 1; j >= 0 && xs[j+1] < xs[j]; --j)
            swap(xs, j, j+1);
}
This is because the swapping function, which exchanges two elements, typically uses a temporary variable like the following:
void swap(Key* xs, int i, int j) {
    Key temp = xs[i];
    xs[i] = xs[j];
    xs[j] = temp;
}
So the ANSI C program presented above performs 3m assignments, where m is the number of inner loop iterations, while the pseudo code as well as the Python program use a shift operation instead of swapping, performing only m + 2 assignments.
We can also provide an Insert() function explicitly, and call it from the general insertion sort algorithm in the previous section. We skip the detailed realization here and leave it as an exercise.
All the insertion algorithms are bound to O(n) time, where n is the length of the sequence, no matter the differences among them, such as scanning from the left or from the right. Thus the overall performance of insertion sort is quadratic, O(n²).
Exercise 2.1 
• Provide an explicit insertion function, and call it from the general insertion sort algorithm. Please realize it in both a procedural way and a functional way.
2.3 Improvement 1 
Let's go back to the question of why a human can find the proper position for insertion so quickly. We have shown a solution based on scanning. Note the fact that at any time, all cards at hand have been well sorted; another possible solution is to use binary search to find that location.
We'll explain search algorithms in another dedicated chapter; binary search is only briefly introduced here for illustration purposes. The algorithm is changed to call a binary search procedure.
function Sort(A)
    for i ← 2 to Length(A) do
        x ← A[i]
        p ← Binary-Search(A[1...i-1], x)
        for j ← i down to p do
            A[j] ← A[j-1]
        A[p] ← x
Instead of scanning elements one by one, binary search utilizes the information that all elements in the array slice {A_1, ..., A_(i-1)} are sorted. Let's assume the order is monotonically increasing. To find a position j that satisfies A_(j-1) ≤ x ≤ A_j, we can first examine the middle element, say A_⌊i/2⌋. If x is less than it, we next recursively perform binary search in the first half of the sequence; otherwise, we only need to search the last half. Each time, we halve the elements to be examined, so this search process runs in O(lg n) time to locate the insertion position.
function Binary-Search(A, x)
    l ← 1
    u ← 1 + Length(A)
    while l < u do
        m ← ⌊(l + u)/2⌋
        if A_m = x then
            return m    ▷ found a duplicated element
        else if A_m < x then
            l ← m + 1
        else
            u ← m
    return l
The improved insertion sort algorithm is still bound to O(n²). Compared with the previous section, where we used O(n²) comparisons and O(n²) moves, with binary search we use only O(n lg n) comparisons, but still O(n²) moves. The Python program for this algorithm is given below.
def isort(xs): 
n = len(xs) 
for i in range(1, n): 
x = xs[i] 
p = binary_search(xs[:i], x) 
for j in range(i, p, -1): 
xs[j] = xs[j-1] 
xs[p] = x 
def binary_search(xs, x):
    l = 0
    u = len(xs)
    while l < u:
        m = (l+u)//2
        if xs[m] == x:
            return m
        elif xs[m] < x:
            l = m + 1
        else:
            u = m
    return l
Exercise 2.2
Write the binary search in a recursive manner. You needn't use a purely functional programming language.
2.4 Improvement 2 
Although we improved the number of comparisons to O(n lg n) in the previous section, the number of moves is still O(n²). The reason movement takes so long is that the sequence is stored in a plain array. The nature of an array is a continuous-layout data structure, so the insertion operation is expensive. This hints that we can use a linked-list setting to represent the sequence; it improves the insertion operation from O(n) to constant time O(1).
insert(A, x) =
    {x}                                      : A = φ
    {x} ∪ A                                  : x < A_1
    {A_1} ∪ insert({A_2, A_3, ..., A_n}, x)  : otherwise
(2.3)
Translating the algorithm to Haskell yields the below program. 
insert :: (Ord a) => [a] -> a -> [a]
insert [] x = [x]
insert (y:ys) x = if x < y then x:y:ys else y : insert ys x
And we can complete the two versions of the insertion sort program based on the first two equations in this chapter.
isort [] = [] 
isort (x:xs) = insert (isort xs) x 
Or we can represent the recursion with folding. 
isort = foldl insert [] 
The linked-list solution can also be described imperatively. Suppose the function Key(x) returns the value of the element stored in node x, and Next(x) accesses the next node in the linked-list.
function Insert(L, x)
    p ← NIL
    H ← L
    while L ≠ NIL ∧ Key(L) < Key(x) do
        p ← L
        L ← Next(L)
    Next(x) ← L
    if p = NIL then
        H ← x
    else
        Next(p) ← x
    return H
For example in ANSI C, the linked-list can be defined as the following.

struct node {
    Key key;
    struct node* next;
};
Thus the insert function can be given as below. 
struct node* insert(struct node* lst, struct node* x) {
    struct node *p, *head;
    p = NULL;
    for (head = lst; lst && x->key > lst->key; lst = lst->next)
        p = lst;
    x->next = lst;
    if (!p)
        return x;
    p->next = x;
    return head;
}
Instead of using an explicit linked-list, such as a pointer or reference based structure, the linked-list can also be realized with an additional index array. For any array element A_i, Next_i stores the index of the element that follows A_i; that is, A_(Next_i) is the next element after A_i.
The insertion algorithm based on this solution is given like below. 
function Insert(A, Next, i)
    j ← ⊥
    while Next_j ≠ NIL ∧ A_(Next_j) < A_i do
        j ← Next_j
    Next_i ← Next_j
    Next_j ← i
Here ⊥ means the head of the Next table. The corresponding Python program for this algorithm is given as the following.
def isort(xs):
    n = len(xs)
    next = [-1]*(n+1)
    for i in range(n):
        insert(xs, next, i)
    return next

def insert(xs, next, i):
    j = -1
    while next[j] != -1 and xs[next[j]] < xs[i]:
        j = next[j]
    next[j], next[i] = i, next[j]
Although we changed the insertion operation to constant time by using a linked-list, we still have to traverse the linked-list to find the position, which results in O(n²) comparisons. This is because a linked-list, unlike an array, doesn't support random access. It means we can't use binary search in the linked-list setting.
Exercise 2.3 
• Complete the insertion sort by using the linked-list insertion function in your favorite imperative programming language.
• The index-based linked-list returns a sequence of rearranged indices as the result. Write a program to re-order the original array of elements from this result.
2.5 Final improvement by binary search tree 
It seems that we have driven into a corner. We must improve both the comparison and the insertion at the same time, or we will end up with O(n²) performance. We must use binary search; this is the only way to improve the comparison time to O(lg n). On the other hand, we must change the data structure, because we can't achieve constant-time insertion at a position with a plain array.
This reminds us of our 'hello world' data structure, binary search tree. It naturally supports binary search by its definition. At the same time, we can insert a new leaf in a binary search tree in O(1) constant time if we have already found the location.
So the algorithm changes to this. 
function Sort(A)
    T ← φ
    for each x ∈ A do
        T ← Insert-Tree(T, x)
    return To-List(T)
Where Insert-Tree() and To-List() are described in the previous chapter about binary search tree. As we have analyzed for binary search tree, the performance of tree sort is bound to O(n lg n), which is the lower limit of comparison-based sorting[3].
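For illustration, here is one possible Python rendering (an addition, not from the original text; it assumes the same hypothetical tree_insert used earlier, and a plain node with left, key and right fields):

def to_list(t):
    # in-order traversal yields the sorted sequence
    if t is None:
        return []
    return to_list(t.left) + [t.key] + to_list(t.right)

def tree_sort(xs):
    t = None
    for x in xs:
        t = tree_insert(t, x)
    return to_list(t)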
2.6 Short summary 
In this chapter, we presented the evolution process of insertion sort. Insertion sort is well explained in most textbooks as the first sorting algorithm. It has a simple and straightforward idea, but the performance is quadratic. Some textbooks stop here, but we wanted to show that there exist ways to improve it from different points of view. We first tried to save comparison time by using binary search, and then tried to save the insertion operation by changing the data structure to a linked-list. Finally, we combined these two ideas and evolved insertion sort into tree sort.
Bibliography 
[1] http://en.wikipedia.org/wiki/Bubble sort

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. ISBN: 0262032937. The MIT Press. 2001

[3] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional; 2nd edition (May 4, 1998). ISBN-10: 0201896850. ISBN-13: 978-0201896855
Chapter 3 
Red-black tree, not so 
complex as it was thought 
3.1 Introduction 
3.1.1 Exploit the binary search tree 
We have shown the power of binary search tree by using it to count the occurrence of every word in the Bible. The idea is to use the binary search tree as a dictionary for counting. One may come to the idea of feeding a yellow pages book 1 to a binary search tree, and using it to look up the phone number of a contact. Modifying the word occurrence counting program a bit yields the following code.
int main(int, char**) {
    ifstream f("yp.txt");
    map<string, string> dict;
    string name, phone;
    while (f >> name && f >> phone)
        dict[name] = phone;
    for (;;) {
        cout << "\nname: ";
        cin >> name;
        if (dict.find(name) == dict.end())
            cout << "not found";
        else
            cout << "phone: " << dict[name];
    }
}
This program works well. However, if you replace the STL map with the binary search tree as mentioned previously, the performance will be bad, especially when you search for names such as Zara, Zed, Zulu. This is because the content of the yellow pages is typically listed in lexicographic order, which means the name list is in increasing order. If we try to insert a
1A name-phone number contact list book 
sequence of numbers 1, 2, 3, ..., n into a binary search tree, we will get a tree like the one in Figure 3.1.
1 
2 
3 
... 
n 
Figure 3.1: unbalanced tree 
This is an extremely unbalanced binary search tree. Looking up takes O(h) time for a tree of height h. In the balanced case, we benefit from the binary search tree with O(lg N) search time, but in this extreme case, the search time degrades to O(N). That's no better than a normal linked-list.
Exercise 3.1 
• For a very big yellow pages list, one may want to speed up the dictionary building process with two concurrent tasks (threads or processes). One task reads the name-phone pairs from the head of the list, while the other reads from the tail. The building terminates when the two tasks meet in the middle of the list. What will the binary search tree look like after building? What if you split the list into more than two parts and use more tasks?
• Can you find any more cases that exploit a binary search tree? Please consider the unbalanced trees shown in figure 3.2.
3.1.2 How to ensure the balance of the tree 
In order to avoid such cases, we can shuffle the input sequence with a randomized algorithm, such as the one described in Section 12.4 of [2]. However, this method doesn't always work; for example, when the input is fed from the user interactively, the tree needs to be built/updated after each input.

People have found many solutions to make a binary search tree balanced. Many of them rely on rotation operations on the binary search tree. Rotation operations change the tree structure while maintaining the ordering of the elements. Thus they either improve or keep the balance property of the binary search tree.
Figure 3.2: Some unbalanced trees: (a) a descending chain n, n-1, n-2, ..., 1; (b) a zig-zag chain alternating between small and large keys; (c) a tree whose root m has two descending chains, m-1, m-2, ..., 1 and m+1, m+2, ..., n.
In this chapter, we'll first introduce the red-black tree, which is one of the most popular and widely used self-adjusting balanced binary search trees. In the next chapter, we'll introduce the AVL tree, which is another intuitive solution. In a later chapter about binary heaps, we'll show another interesting tree called the splay tree, which gradually adjusts the tree to make it more and more balanced.
3.1.3 Tree rotation 
Figure 3.3: Tree rotation. 'rotate-left' transforms the tree (a) node(a, X, node(b, Y, c)) on the left side into the tree (b) node(node(a, X, b), Y, c) on the right side, and 'rotate-right' does the inverse transformation.
Tree rotation is a special operation that transforms the tree structure without changing the in-order traversal result. It is based on the fact that for a specified ordering, there are multiple binary search trees corresponding to it. Figure 3.3 shows tree rotation: for the binary search tree on the left side, left rotation transforms it to the tree on the right, and right rotation does the inverse transformation. Although tree rotation can be realized in a procedural way, there exists a quite simple functional description if using pattern matching.
rotateL(T) =
    node(node(a, X, b), Y, c)  : pattern(T) = node(a, X, node(b, Y, c))
    T                          : otherwise
(3.1)

rotateR(T) =
    node(a, X, node(b, Y, c))  : pattern(T) = node(node(a, X, b), Y, c)
    T                          : otherwise
(3.2)
However, the imperative pseudo code has to set all the fields accordingly.

1: function Left-Rotate(T, x)
2:     p ← Parent(x)
3:     y ← Right(x)    ▷ assume y ≠ NIL
4:     a ← Left(x)
5:     b ← Left(y)
6:     c ← Right(y)
7:     Replace(x, y)
8:     Set-Children(x, a, b)
9:     Set-Children(y, x, c)
10:    if p = NIL then
11:        T ← y
12:    return T

13: function Right-Rotate(T, y)
14:    p ← Parent(y)
15:    x ← Left(y)    ▷ assume x ≠ NIL
16:    a ← Left(x)
17:    b ← Right(x)
18:    c ← Right(y)
19:    Replace(y, x)
20:    Set-Children(y, b, c)
21:    Set-Children(x, a, y)
22:    if p = NIL then
23:        T ← x
24:    return T

25: function Set-Left(x, y)
26:    Left(x) ← y
27:    if y ≠ NIL then Parent(y) ← x

28: function Set-Right(x, y)
29:    Right(x) ← y
30:    if y ≠ NIL then Parent(y) ← x

31: function Set-Children(x, L, R)
32:    Set-Left(x, L)
33:    Set-Right(x, R)

34: function Replace(x, y)
35:    if Parent(x) = NIL then
36:        if y ≠ NIL then Parent(y) ← NIL
37:    else if Left(Parent(x)) = x then Set-Left(Parent(x), y)
38:    else Set-Right(Parent(x), y)
39:    Parent(x) ← NIL
Comparing these pseudo codes with the pattern matching functions, the former focus on the changing structure states, while the latter focus on the rotation process. As the title of this chapter indicates, red-black tree needn't be so complex as it was thought. Most traditional algorithm text books use the classic procedural way to teach red-black tree; there are several cases to deal with, and all need carefulness in manipulating the node fields. However, by changing the mindset to functional settings, things become intuitive and simple, although there is some performance overhead.
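As an illustration, here is one possible Python sketch of the rotation procedures above (an addition, not from the original text; a Node class with key, color, left, right and parent fields is assumed). The same set_left/set_right helpers and left_rotate/right_rotate functions are the ones assumed later by the imperative insertion program in section 3.5.

class Node:
    def __init__(self, key, color=None):
        self.key, self.color = key, color
        self.left = self.right = self.parent = None

    def set_left(self, x):
        self.left = x
        if x is not None: x.parent = self

    def set_right(self, x):
        self.right = x
        if x is not None: x.parent = self

    def set_children(self, l, r):
        self.set_left(l)
        self.set_right(r)

def replace(x, y):
    # put y at the position x occupies under x's parent
    p = x.parent
    if p is None:
        if y is not None: y.parent = None
    elif p.left is x:
        p.set_left(y)
    else:
        p.set_right(y)
    x.parent = None

def left_rotate(t, x):
    p, y = x.parent, x.right           # assume y is not None
    a, b, c = x.left, y.left, y.right
    replace(x, y)
    x.set_children(a, b)
    y.set_children(x, c)
    return y if p is None else t       # y becomes the new root if x was root

def right_rotate(t, y):
    p, x = y.parent, y.left            # assume x is not None
    a, b, c = x.left, x.right, y.right
    replace(y, x)
    y.set_children(b, c)
    x.set_children(a, y)
    return x if p is None else t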
Most of the content in this chapter is based on Chris Okasaki's work in [2].
3.2 Definition of red-black tree
Red-black tree is a type of self-balancing binary search tree[3]. 2 By using color changing and rotation, red-black tree provides a very simple and straightforward way to keep the tree balanced. For a binary search tree, we can augment the nodes with a color field; a node can be colored either red or black. We call a binary search tree a red-black tree if it satisfies the following 5 properties[2].
1. Every node is either red or black. 
2. The root is black. 
3. Every leaf (NIL) is black. 
4. If a node is red, then both its children are black. 
5. For each node, all paths from the node to descendant leaves contain the 
same number of black nodes. 
Why do these 5 properties ensure the red-black tree is well balanced? Because they have a key characteristic: the longest path from the root to a leaf can't be more than twice as long as the shortest path.

Please note the 4-th property, which means there can't be two adjacent red nodes. So the shortest path contains only black nodes; any path longer than the shortest one has extra red nodes in between. According to property 5, all paths have the same number of black nodes; this finally ensures that no path is more than twice as long as any other[3]. Figure 3.4 shows an example red-black tree.
Figure 3.4: An example red-black tree
2Red-black tree is one of the equivalent forms of 2-3-4 tree (see the chapter about B-tree for 2-3-4 trees). That is to say, for any 2-3-4 tree, there is at least one red-black tree with the same data order.
All read-only operations, such as search and min/max, are the same as in binary search tree; only insertion and deletion are special. As we have shown in the word occurrence example, many implementations of set or map containers are based on red-black tree. One example is the C++ Standard Template Library (STL)[6]. As mentioned previously, the only change in the data layout is the color information augmenting the binary search tree. This can be represented as a data field in imperative languages such as C++, like below.
enum Color { Red, Black };

template <class T>
struct node {
    Color color;
    T key;
    node* left;
    node* right;
    node* parent;
};
In functional settings, we can add the color information to the constructors; below is the Haskell definition of red-black tree.

data Color = R | B
data RBTree a = Empty
              | Node Color (RBTree a) a (RBTree a)
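The five properties can also be checked mechanically. The following small sketch (an addition, not from the original text) validates properties 1, 2, 4 and 5 in Python, assuming a hypothetical Node with color, left and right fields, where None stands for the black NIL leaves:

RED, BLACK = 0, 1

def black_height(t):
    # return the black height of t if properties 4 and 5 hold below t,
    # otherwise None
    if t is None:
        return 1                      # NIL leaves count as black
    hl, hr = black_height(t.left), black_height(t.right)
    if hl is None or hr is None or hl != hr:
        return None                   # property 5 violated
    if t.color == RED:
        for c in (t.left, t.right):
            if c is not None and c.color == RED:
                return None           # property 4 violated
    return hl + (1 if t.color == BLACK else 0)

def is_red_black(t):
    # property 2: the root is black; properties 4 and 5 via black_height
    return (t is None or t.color == BLACK) and black_height(t) is not None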
Exercise 3.2 
• Can you prove that a red-black tree with n nodes has height at most 2 lg(n + 1)?
3.3 Insertion 
Inserting a new node as described for binary search tree may cause the tree to become unbalanced. The red-black properties have to be maintained, so we need to do some fixing by transforming the tree after insertion.

When we insert a new key, one good practice is to always insert it as a red node. As long as the newly inserted node isn't the root of the tree, we can keep all the properties except the 4-th one: the insertion may bring two adjacent red nodes.

Functional and procedural implementations have different fixing methods. One is intuitive but has some overhead; the other is a bit complex but has higher performance. Most text books about algorithms introduce the latter. In this chapter, we focus on the former to show how easily a red-black tree insertion algorithm can be realized. The traditional procedural method will be given only for comparison purposes.
As described by Chris Okasaki, there are in total 4 cases which violate property 4. All of them have 2 adjacent red nodes. However, they have a uniform form after fixing[2], as shown in figure 3.5. Note that this transformation moves the redness one level up, so this is a bottom-up recursive fixing, and the last step may turn the root node red.
Figure 3.5: 4 cases for balancing a red-black tree after insertion
According to property 2, the root is always black; thus we need a final fixing step to revert the root color to black.
Observing that the 4 cases and the fixed result have strong pattern features, the fixing function can be defined using a method similar to the one we mentioned for tree rotation. To avoid overly long formulas, we abbreviate Color as C, Black as B, and Red as R.
balance(T) =
    node(R, node(B, A, x, B), y, node(B, C, z, D))  : match(T)
    T                                               : otherwise
(3.3)

where the function node() constructs a red-black tree node from 4 parameters: the color, the left child, the key and the right child. Function match() tests whether a tree is one of the 4 possible patterns, as the following:

match(T) =
    pattern(T) = node(B, node(R, node(R, A, x, B), y, C), z, D) ∨
    pattern(T) = node(B, node(R, A, x, node(R, B, y, C)), z, D) ∨
    pattern(T) = node(B, A, x, node(R, B, y, node(R, C, z, D))) ∨
    pattern(T) = node(B, A, x, node(R, node(R, B, y, C), z, D))
With the function balance() defined, we can modify the previous binary search tree insertion functions to make them work for red-black tree.

insert(T, k) = makeBlack(ins(T, k))    (3.4)

where

ins(T, k) =
    node(R, φ, k, φ)                     : T = φ
    balance(node(C, ins(L, k), Key, R))  : k < Key
    balance(node(C, L, Key, ins(R, k)))  : otherwise
(3.5)

L, R, Key represent the left child, the right child and the key of the tree:

L = left(T)
R = right(T)
Key = key(T)

Function makeBlack() is defined as the following; it forces the color of a non-empty tree to be black.

makeBlack(T) = node(B, L, Key, R)    (3.6)
Summarizing the above functions and using the language's pattern matching support, we come to the following Haskell program.
insert :: (Ord a) => RBTree a -> a -> RBTree a
insert t x = makeBlack $ ins t where
    ins Empty = Node R Empty x Empty
    ins (Node color l k r)
        | x < k = balance color (ins l) k r
        | otherwise = balance color l k (ins r) --[3]
    makeBlack (Node _ l k r) = Node B l k r
balance :: Color -> RBTree a -> a -> RBTree a -> RBTree a
balance B (Node R (Node R a x b) y c) z d =
    Node R (Node B a x b) y (Node B c z d)
balance B (Node R a x (Node R b y c)) z d =
    Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R b y (Node R c z d)) =
    Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R (Node R b y c) z d) =
    Node R (Node B a x b) y (Node B c z d)
balance color l k r = Node color l k r
Note that the balance function is changed a bit from the original definition. Instead of passing the tree, we pass the color, the left child, the key and the right child to it. This saves a pair of 'boxing' and 'un-boxing' operations.

This program doesn't handle the case of inserting a duplicated key. However, it is possible to handle it either by overwriting or by skipping. Another option is to augment the data with a linked list[2].

Figure 3.6 shows two red-black trees built from feeding the lists 11, 2, 14, 1, 7, 15, 5, 8, 4 and 1, 2, ..., 8.
Figure 3.6: Insertion results generated by the Haskell algorithm
This algorithm shows great simplicity by summarizing the uniform feature of the four different unbalanced cases. It is more expressive than the traditional tree rotation approach: even in programming languages which don't support pattern matching, the algorithm can still be implemented by manually checking the patterns. A Scheme/Lisp program available along with this book can be referenced as an example. The insertion algorithm takes O(lg N) time to insert a key into a red-black tree of N nodes.
Exercise 3.3 
• Write a program in an imperative language, such as C, C++ or Python, to realize the same algorithm in this section. Note that, because there is no language-supported pattern matching, you need to test the 4 different cases manually.
3.4 Deletion 
Recall the deletion section of binary search tree. Deletion is 'imperative only' for red-black tree as well. In typical practice, one often builds the tree just once, and performs lookups frequently after that. Okasaki explained why he didn't provide red-black tree deletion in his work [3]; one reason is that deletion is much messier than insertion.

The purpose of this section is just to show that red-black tree deletion is possible in purely functional settings, although it actually rebuilds the tree, because trees are read only in terms of purely functional data structures. In the real world, it's up to the user (or actually the programmer) to adopt the proper solution. One option is to mark the node to be deleted with a flag, and perform a tree rebuild when the number of deleted nodes exceeds 50% of the total number of nodes.
Not only in functional settings, but even in imperative settings, deletion is more complex than insertion. We face more cases to fix. Deletion may also violate the red-black tree properties, so we need to fix them after the normal deletion as described for binary search tree.

The deletion algorithm in this book is based on a handout in [5]. The problem only happens when we try to delete a black node, because doing so violates the 5-th property of red-black tree: the number of black nodes on the paths through the deleted node decreases, so the tree no longer has a uniform black-height.

When deleting a black node, we can restore property 5 by introducing the concept of 'doubly-black'[2]. It means that although the node is deleted, its blackness is kept, by storing it in the parent node. If the parent node is red, it turns black; however, if it is already black, it turns 'doubly-black'.
In order to express the 'doubly-black' node, the definition needs some modification accordingly.

data Color = R | B | BB -- BB: doubly black, for deletion
data RBTree a = Empty | BBEmpty -- doubly black empty
              | Node Color (RBTree a) a (RBTree a)
When deleting a node, we first perform the same deletion algorithm as in binary search tree, mentioned in the previous chapter. After that, if the node to be sliced out is black, we need to fix the tree to keep the red-black properties. Let's denote the empty tree as φ; a non-empty tree can be decomposed to node(C, L, Key, R): its color, left sub-tree, key and right sub-tree. The delete function is defined as the following.

delete(T, k) = blackenRoot(del(T, k))    (3.7)
where

del(T, k) =
    φ                                      : T = φ
    fixBlack2(node(C, del(L, k), Key, R))  : k < Key
    fixBlack2(node(C, L, Key, del(R, k)))  : k > Key
    mkBlk(R)                               : k = Key ∧ L = φ ∧ C = B
    R                                      : k = Key ∧ L = φ ∧ C ≠ B
    mkBlk(L)                               : k = Key ∧ R = φ ∧ C = B
    L                                      : k = Key ∧ R = φ ∧ C ≠ B
    fixBlack2(node(C, L, k', del(R, k')))  : otherwise
(3.8)

with k' = min(R).
The real deletion happens inside the function del. For the trivial case that the tree is empty, the deletion result is φ. If the key to be deleted is less than the key of the current node, we recursively perform deletion on its left sub-tree; if it is greater than the key of the current node, then we recursively delete the key from the right sub-tree. Because the recursion may bring doubly-blackness, we need to fix it.

If the key to be deleted is equal to the key of the current node, we need to splice it out. If one of its children is empty, we just replace the node by the other one and preserve the blackness of the node. Otherwise we cut and paste the minimum element k' = min(R) from the right sub-tree.
Function delete just forces the result tree of del to have a black root. This is realized by the function blackenRoot.

blackenRoot(T) =
    φ                   : T = φ
    node(B, L, Key, R)  : otherwise
(3.9)
Compared with the makeBlack function defined in the red-black tree insertion section, they are almost the same, except for the case of the empty tree. That case is only valid in deletion, because insertion can't result in an empty tree, while deletion can.

Function mkBlk is defined to preserve the blackness of a node. If the node to be sliced isn't black, this function won't be applied; otherwise, it turns a red node black, and turns a black node doubly-black. This function also marks an empty tree as doubly-black empty.
mkBlk(T) =
    Φ                    : T = φ
    node(B, L, Key, R)   : C = R
    node(B², L, Key, R)  : C = B
    T                    : otherwise
(3.10)

where Φ means the doubly-black empty node and B² is the doubly-black color.
Summarizing the above functions yields the following Haskell program. 
delete :: (Ord a) => RBTree a -> a -> RBTree a
delete t x = blackenRoot (del t x) where
    del Empty _ = Empty
    del (Node color l k r) x
        | x < k = fixDB color (del l x) k r
        | x > k = fixDB color l k (del r x)
        -- x == k, delete this node
        | isEmpty l = if color == B then makeBlack r else r
        | isEmpty r = if color == B then makeBlack l else l
        | otherwise = fixDB color l k' (del r k') where k' = min r
    blackenRoot (Node _ l k r) = Node B l k r
    blackenRoot _ = Empty
makeBlack :: RBTree a -> RBTree a
makeBlack (Node B l k r) = Node BB l k r -- doubly black
makeBlack (Node _ l k r) = Node B l k r
makeBlack Empty = BBEmpty
makeBlack t = t
The final step of the red-black tree deletion algorithm is to realize the fixBlack2 function. The purpose of this function is to eliminate the 'doubly-black' colored nodes by rotation and color changing.

Let's solve the doubly-black empty node first. For any node, if one of its children is doubly-black empty and the other child is non-empty, we can safely replace the doubly-black empty with a normal empty node.
As in figure 3.7, if we are going to delete the node 4 from the tree (instead of showing the whole tree, only part of it is shown), the program will use a doubly-black empty node to replace node 4. In the figure, the doubly-black node is shown as a black circle with 2 edges. It means that node 5 has a doubly-black empty left child and a non-empty right child (a leaf node with key 6). In such a case we can safely change the doubly-black empty to a normal empty node, which won't violate any red-black properties.
Figure 3.7: One child is a doubly-black empty node, the other child is non-empty. (a) Delete 4 from the tree; (b) after 4 is sliced off, it is doubly-black empty; (c) we can safely change it to a normal NIL.
On the other hand, if a node has a doubly-black empty child and the other child is empty, we have to push the doubly-blackness up one level. For example in figure 3.8, if we want to delete node 1 from the tree, the program will use a doubly-black empty node to replace 1. Then node 2 has a doubly-black empty left child and an empty right child. In such a case we must mark node 2 as doubly-black after changing its left child back to empty.
Figure 3.8: One child is a doubly-black empty node, the other child is empty. (a) Delete 1 from the tree; (b) after 1 is sliced off, it is doubly-black empty; (c) we must push the doubly-blackness up to node 2.
Based on the above analysis, in order to fix the doubly-black empty node, we define the function partially, like the following.

fixBlack2(T) =
    node(B², φ, Key, φ)  : (L = Φ ∧ R = φ) ∨ (L = φ ∧ R = Φ)
    node(C, φ, Key, R)   : L = Φ ∧ R ≠ φ
    node(C, L, Key, φ)   : R = Φ ∧ L ≠ φ
    ...                  : ...
(3.11)
After dealing with the doubly-black empty node, we need to fix the case where the sibling of the doubly-black node is black and has one red child. In this situation, we can fix the doubly-blackness with one rotation. Actually there are 4 different sub-cases, and all of them can be transformed to one uniform pattern. They are shown in figure 3.9. These cases are described in [2] as case 3 and case 4.
Figure 3.9: Fix the doubly black by rotation, the sibling of the doubly-black 
node is black, and it has one red child.
The handling of these 4 sub-cases can be defined on top of formula 3.11:

fixBlack2(T) =
    ...                                                    : ...
    node(C, node(B, mkBlk(A), x, B), y, node(B, C, z, D))  : p1.1
    node(C, node(B, A, x, B), y, node(B, C, z, mkBlk(D)))  : p1.2
    ...                                                    : ...
(3.12)

where p1.1 and p1.2 each represent 2 patterns, as the following:

p1.1 = { node(C, A, x, node(B, node(R, B, y, C), z, D)) ∧ Color(A) = B²
       ∨ node(C, A, x, node(B, B, y, node(R, C, z, D))) ∧ Color(A) = B² }

p1.2 = { node(C, node(B, A, x, node(R, B, y, C)), z, D) ∧ Color(D) = B²
       ∨ node(C, node(B, node(R, A, x, B), y, C), z, D) ∧ Color(D) = B² }
Besides the above cases, there is another one where not only the sibling of the doubly-black node is black, but also its two children are black. We can change the color of the sibling node to red, revert the doubly-black node to black, and propagate the doubly-blackness one level up to the parent node, as shown in figure 3.10. Note that there are two symmetric sub-cases. This case is described in [2] as case 2.

We go on adding this fixing after formula 3.12:

fixBlack2(T) =
    ...                                            : ...
    mkBlk(node(C, mkBlk(A), x, node(R, B, y, C)))  : p2.1
    mkBlk(node(C, node(R, A, x, B), y, mkBlk(C)))  : p2.2
    ...                                            : ...
(3.13)
where p2.1 and p2.2 are two patterns as below:

p2.1 = { node(C, A, x, node(B, B, y, C)) ∧ Color(A) = B² ∧ Color(B) = Color(C) = B }

p2.2 = { node(C, node(B, A, x, B), y, C) ∧ Color(C) = B² ∧ Color(A) = Color(B) = B }
There is one final case left: the sibling of the doubly-black node is red. We can do a rotation to change this case into pattern p1.1 or p1.2; figure 3.11 shows it. We can finish formula 3.13 with 3.14:

fixBlack2(T) =
    ...                                                    : ...
    fixBlack2(node(B, fixBlack2(node(R, A, x, B)), y, C))  : p3.1
    fixBlack2(node(B, A, x, fixBlack2(node(R, B, y, C))))  : p3.2
    T                                                      : otherwise
(3.14)
Figure 3.10: Propagate the blackness up. (a) The color of x can be either black or red; (b) if x was red, it becomes black, otherwise it becomes doubly-black. (c) The color of y can be either black or red; (d) if y was red, it becomes black, otherwise it becomes doubly-black.
Figure 3.11: The sibling of the doubly-black node is red.
where p3.1 and p3.2 are two patterns as the following:

p3.1 = { Color(T) = B ∧ Color(L) = B² ∧ Color(R) = R }
p3.2 = { Color(T) = B ∧ Color(L) = R ∧ Color(R) = B² }

These two cases are described in [2] as case 1.
Fixing the doubly-black node with all the above different cases is a recursive function. There are two termination conditions. One is the patterns p1.1 and p1.2, where the doubly-black node is eliminated. The other cases may continuously propagate the doubly-blackness from bottom to top, up to the root; finally the algorithm marks the root node as black anyway, and the doubly-blackness is removed.

Putting formulas 3.11, 3.12, 3.13, and 3.14 together, we can write the final Haskell program.
fixDB :: Color -> RBTree a -> a -> RBTree a -> RBTree a
fixDB color BBEmpty k Empty = Node BB Empty k Empty
fixDB color BBEmpty k r = Node color Empty k r
fixDB color Empty k BBEmpty = Node BB Empty k Empty
fixDB color l k BBEmpty = Node color l k Empty
-- the sibling is black, and it has one red child
fixDB color a@(Node BB _ _ _) x (Node B (Node R b y c) z d) =
    Node color (Node B (makeBlack a) x b) y (Node B c z d)
fixDB color a@(Node BB _ _ _) x (Node B b y (Node R c z d)) =
    Node color (Node B (makeBlack a) x b) y (Node B c z d)
fixDB color (Node B a x (Node R b y c)) z d@(Node BB _ _ _) =
    Node color (Node B a x b) y (Node B c z (makeBlack d))
fixDB color (Node B (Node R a x b) y c) z d@(Node BB _ _ _) =
    Node color (Node B a x b) y (Node B c z (makeBlack d))
-- the sibling and its 2 children are all black,
-- propagate the blackness up
fixDB color a@(Node BB _ _ _) x (Node B b@(Node B _ _ _) y c@(Node B _ _ _))
    = makeBlack (Node color (makeBlack a) x (Node R b y c))
fixDB color (Node B a@(Node B _ _ _) x b@(Node B _ _ _)) y c@(Node BB _ _ _)
    = makeBlack (Node color (Node R a x b) y (makeBlack c))
-- the sibling is red
fixDB B a@(Node BB _ _ _) x (Node R b y c) = fixDB B (fixDB R a x b) y c
fixDB B (Node R a x b) y c@(Node BB _ _ _) = fixDB B a x (fixDB R b y c)
-- otherwise
fixDB color l k r = Node color l k r
The deletion algorithm takes O(lg N) time to delete a key from a red-black tree of N nodes.
Exercise 3.4 
• As we mentioned in this section, deletion can be implemented by just marking the node as deleted without actually removing it. Once the number of marked nodes exceeds 50% of the total node number, a tree re-build is performed. Try to implement this method in your favorite programming language.
3.5 Imperative red-black tree algorithm ? 
We have almost finished all the content of this chapter. By induction over the patterns, we can implement the red-black tree in a simple way, compared to the imperative tree rotation solution. However, we should show the imperative counterpart for completeness.

For insertion, the basic idea is to use the same algorithm as described in binary search tree, and then fix the balance problem by rotation before returning the final result.
1: function Insert(T, k)
2:     root ← T
3:     x ← Create-Leaf(k)
4:     Color(x) ← RED
5:     parent ← NIL
6:     while T ≠ NIL do
7:         parent ← T
8:         if k < Key(T) then
9:             T ← Left(T)
10:        else
11:            T ← Right(T)
12:    Parent(x) ← parent
13:    if parent = NIL then    ▷ tree T is empty
14:        return x
15:    else if k < Key(parent) then
16:        Left(parent) ← x
17:    else
18:        Right(parent) ← x
19:    return Insert-Fix(root, x)
The only difference from the binary search tree insertion algorithm is that we set the color of the new node to red, and perform fixing before returning. It is easy to translate the pseudo code to a real imperative programming language, for instance Python 3.
def rb_insert(t, key):
    root = t
    x = Node(key)
    parent = None
    while t:
        parent = t
        if key < t.key:
            t = t.left
        else:
            t = t.right
    if parent is None: # tree is empty
        root = x
    elif key < parent.key:
        parent.set_left(x)
    else:
        parent.set_right(x)
    return rb_insert_fix(root, x)
3C and C++ source codes are available along with this book.
There are 3 base cases for fixing, and if we take the left-right symmetry into consideration, there are in total 6 cases. Among them, two cases can be merged, because they both have an uncle node in red color: we can toggle the parent color and uncle color to black, and set the grandparent color to red. With this merging, the fixing algorithm can be realized as the following.
1: function Insert-Fix(T, x)
2:     while Parent(x) ≠ NIL and Color(Parent(x)) = RED do
3:         if Color(Uncle(x)) = RED then    ▷ Case 1, x's uncle is red
4:             Color(Parent(x)) ← BLACK
5:             Color(Grand-Parent(x)) ← RED
6:             Color(Uncle(x)) ← BLACK
7:             x ← Grand-Parent(x)
8:         else                             ▷ x's uncle is black
9:             if Parent(x) = Left(Grand-Parent(x)) then
10:                if x = Right(Parent(x)) then    ▷ Case 2, x is a right child
11:                    x ← Parent(x)
12:                    T ← Left-Rotate(T, x)
               ▷ Case 3, x is a left child
13:                Color(Parent(x)) ← BLACK
14:                Color(Grand-Parent(x)) ← RED
15:                T ← Right-Rotate(T, Grand-Parent(x))
16:            else
17:                if x = Left(Parent(x)) then    ▷ Case 2, symmetric
18:                    x ← Parent(x)
19:                    T ← Right-Rotate(T, x)
               ▷ Case 3, symmetric
20:                Color(Parent(x)) ← BLACK
21:                Color(Grand-Parent(x)) ← RED
22:                T ← Left-Rotate(T, Grand-Parent(x))
23:    Color(T) ← BLACK
24:    return T
This program takes O(lg N) time to insert a new key into the red-black tree. Comparing this pseudo code with the balance function defined in the previous section, we can see the difference. They differ not only in terms of simplicity, but also in logic: even if we feed the same series of keys to the two algorithms, they may build different red-black trees. There is a bit of performance overhead in the pattern matching algorithm; Okasaki discussed the difference in detail in his paper[2]. Translating the above algorithm to Python yields the below program.
# Fix the red->red violation
def rb_insert_fix(t, x):
    while x.parent and x.parent.color == RED:
        if x.uncle().color == RED:
            # case 1: ((a:R x:R b) y:B c:R) => ((a:R x:B b) y:R c:B)
            set_color([x.parent, x.grandparent(), x.uncle()],
                      [BLACK, RED, BLACK])
            x = x.grandparent()
        else:
            if x.parent == x.grandparent().left:
                if x == x.parent.right:
                    # case 2: ((a x:R b:R) y:B c) => case 3
                    x = x.parent
                    t = left_rotate(t, x)
                # case 3: ((a:R x:R b) y:B c) => (a:R x:B (b y:R c))
                set_color([x.parent, x.grandparent()], [BLACK, RED])
                t = right_rotate(t, x.grandparent())
            else:
                if x == x.parent.left:
                    # case 2': (a x:B (b:R y:R c)) => case 3'
                    x = x.parent
                    t = right_rotate(t, x)
                # case 3': (a x:B (b y:R c:R)) => ((a x:R b) y:B c:R)
                set_color([x.parent, x.grandparent()], [BLACK, RED])
                t = left_rotate(t, x.grandparent())
    t.color = BLACK
    return t
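The program above assumes a few helpers that are not shown: grandparent(), uncle() and set_color(). One possible sketch of them (an addition, not from the original text) extends the Node class from the tree rotation section:

# hypothetical helpers assumed by rb_insert_fix
def grandparent(self):
    return self.parent.parent

def uncle(self):
    g = self.grandparent()
    return g.right if self.parent is g.left else g.left

Node.grandparent = grandparent
Node.uncle = uncle

def set_color(nodes, colors):
    # assign each color to the corresponding node
    for node, color in zip(nodes, colors):
        node.color = color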
Figure 3.12 shows the results of feeding the same series of keys to the above Python insertion program. Comparing them with figure 3.6, one can tell the difference clearly.
Figure 3.12: Red-black trees created by the imperative algorithm.
We skip the red-black tree deletion algorithm in imperative settings, because it is even more complex than the insertion. The implementation of deletion is left as an exercise of this chapter.
Exercise 3.5 
• Implement the red-black tree deletion algorithm in your favorite imperative programming language. You can refer to [2] for algorithm details.
3.6 More words 
Red-black tree is the most popular implementation of balanced binary search tree. Another one is the AVL tree, which we'll introduce in the next chapter. Red-black tree can be a good starting point for more data structures. If we extend the number of children from 2 to K, and keep the balance as well, it leads to B-tree; if we store the data along the edges instead of inside the nodes, it leads to Tries. However, the handling of multiple cases and the long programs tend to make newcomers think red-black tree is complex.

Okasaki's work helps to make the red-black tree much easier to understand. There are many implementations in other programming languages done in that manner [7]. It also inspired me to find the pattern matching solutions for the Splay tree and AVL tree, etc.
Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. ISBN: 0262032937. The MIT Press. 2001

[2] Chris Okasaki. FUNCTIONAL PEARLS Red-Black Trees in a Functional Setting. J. Functional Programming. 1998

[3] Chris Okasaki. Ten Years of Purely Functional Data Structures. http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data.html

[4] Wikipedia. Red-black tree. http://en.wikipedia.org/wiki/Red-black_tree

[5] Lyn Turbak. Red-Black Trees. cs.wellesley.edu/~cs231/fall01/red-black.pdf Nov. 2, 2001.

[6] SGI STL. http://www.sgi.com/tech/stl/

[7] Pattern matching. http://rosettacode.org/wiki/Pattern_matching
Chapter 4 
AVL tree 
4.1 Introduction 
4.1.1 How to measure the balance of a tree?

Besides the red-black tree, are there any other intuitive solutions for self-balancing binary search trees? In order to measure how balanced a binary search tree is, one idea is to compare the heights of the left and the right sub-trees. If they differ a lot, the tree isn't well balanced. Let's denote the height difference between the two children as below:

δ(T) = |R| − |L|   (4.1)

where |T| means the height of tree T, and L, R denote the left and right sub-trees.

If δ(T) = 0, the tree is definitely balanced. For example, a complete binary tree has N = 2^h − 1 nodes for height h; there are no empty branches except at the leaves. Another trivial case is the empty tree: δ(∅) = 0. The smaller the absolute value of δ(T) is, the more balanced the tree is.

We define δ(T) as the balance factor of a binary search tree.
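As a small illustration (a sketch, not part of this chapter's algorithms), the balance factor can be computed directly from this definition; the code below assumes a node with left and right fields, where None stands for the empty tree:

def height(t):
    # Height of a tree: 0 for the empty tree.
    if t is None:
        return 0
    return 1 + max(height(t.left), height(t.right))

def delta(t):
    # Balance factor: height of the right sub-tree minus the left one.
    return height(t.right) - height(t.left)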
4.2 Definition of AVL tree

An AVL tree is a special binary search tree in which all sub-trees satisfy the following criterion:

|δ(T)| ≤ 1   (4.2)

The absolute value of the balance factor is less than or equal to 1, which means there are only three valid values: -1, 0 and 1. Figure 4.1 shows an example AVL tree.

Figure 4.1: An example AVL tree

Why can the AVL tree keep itself balanced? In other words, can this definition ensure that the height of the tree is O(lg N), where N is the number of nodes in the tree? Let's prove this fact.

For an AVL tree of height h, the number of nodes varies. It can have at most 2^h − 1 nodes, in the case of a complete binary tree. We are interested in how many nodes there are at least. Let's denote the minimum number of nodes for an AVL tree of height h as N(h). The trivial cases are obvious:
• For the empty tree, h = 0, N(0) = 0;
• For a singleton root, h = 1, N(1) = 1.
What's the situation for the common case N(h)? Figure 4.2 shows an AVL tree T of height h. It contains three parts: the root node, and two sub-trees A, B. We have the following fact:

h = max(height(A), height(B)) + 1   (4.3)

There must be one child with height h − 1. Let's say height(A) = h − 1. According to the definition of the AVL tree, we have |height(A) − height(B)| ≤ 1. This leads to the fact that the height of the other sub-tree B can't be lower than h − 2. The total number of nodes in T is the number of nodes in A and in B, plus 1 (for the root node). We conclude that:

N(h) = N(h − 1) + N(h − 2) + 1   (4.4)
This recursion reminds us of the famous Fibonacci series. Actually we can transform it to the Fibonacci series by defining N'(h) = N(h) + 1. Equation (4.4) then changes to:

N'(h) = N'(h − 1) + N'(h − 2)   (4.5)

Lemma 4.2.1. Let N(h) be the minimum number of nodes for an AVL tree with height h, and N'(h) = N(h) + 1. Then

N'(h) ≥ φ^h   (4.6)

where φ = (√5 + 1)/2 is the golden ratio.
Figure 4.2: An AVL tree of height h; one sub-tree has height h − 1, the other h − 2

Proof. For the trivial cases, we have

• h = 0: N'(0) = 1 ≥ φ^0 = 1
• h = 1: N'(1) = 2 ≥ φ^1 = 1.618...

For the induction case, suppose N'(h) ≥ φ^h. Then

N'(h + 1) = N'(h) + N'(h − 1)   {Fibonacci}
          ≥ φ^h + φ^(h−1)
          = φ^(h−1) (φ + 1)   {φ + 1 = φ² = (√5 + 3)/2}
          = φ^(h+1)
From Lemma 4.2.1, we immediately get

h ≤ log_φ(N + 1) = log_φ(2) · lg(N + 1) ≈ 1.44 lg(N + 1)   (4.7)

It tells us that the height of the AVL tree is bounded by O(lg N), which means the AVL tree is balanced.
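As a quick numeric sanity check of this bound (a sketch, not part of the chapter's algorithms), we can evaluate the recursion (4.4) for small heights and compare N'(h) = N(h) + 1 against φ^h:

from math import sqrt

def N(h):
    # Minimum number of nodes of an AVL tree of height h, equation (4.4).
    if h <= 1:
        return h          # N(0) = 0, N(1) = 1
    return N(h - 1) + N(h - 2) + 1

phi = (sqrt(5) + 1) / 2
for h in range(16):
    assert N(h) + 1 >= phi ** h   # Lemma 4.2.1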
During the basic mutable tree operations such as insertion and deletion, if the balance factor changes to any invalid value, some fixing has to be performed to restore |δ| within 1. Most implementations utilize tree rotations. In this chapter, we'll show the pattern matching solution which is inspired by Okasaki's red-black tree solution[2]. Because of this modify-and-fix approach, the AVL tree is also a kind of self-balancing binary search tree. For comparison purposes, we'll also show the procedural algorithms.

Of course we can compute the δ value recursively; another option is to store the balance factor inside each node, and update it when we modify the tree. The latter avoids computing the same value every time.
Based on this idea, we can add one data field δ to the original binary search tree, as in the following C++ code example¹.

template <class T>
struct node {
    int delta;
    T key;
    node* left;
    node* right;
    node* parent;
};

¹Some implementations store the height of the tree instead of δ, as in [5].
In a purely functional setting, some implementations use different constructors to store the δ information. For example, in [1] there are 4 constructors defined: E, N, P, Z. E is for the empty tree, N for a tree with balance factor -1, P for a tree with balance factor +1, and Z for the zero case.

In this chapter, we'll explicitly store the balance factor inside the node.

data AVLTree a = Empty
               | Br (AVLTree a) a (AVLTree a) Int
The immutable operations, including looking up and finding the maximum and minimum elements, are all the same as for the binary search tree. We'll skip them and focus on the mutable operations.
4.3 Insertion

Inserting a new element into an AVL tree may violate the AVL tree property, so that the absolute value of δ exceeds 1. To restore it, one option is to do tree rotations according to the different insertion cases. Most implementations are based on this approach.

Another way is to use a pattern matching method similar to the one Okasaki used in his red-black tree implementation [2]. Inspired by this idea, it is possible to provide a simple and intuitive solution.
When we insert a new key to the AVL tree, the balance factor of the root may change in the range [−1, 1], and the height may increase by at most one. We need to use this information recursively to update the δ values in the upper level nodes. We can define the result of the insertion algorithm as a pair (T', ΔH), where T' is the new tree and ΔH is the increment of height. Let's denote by first(pair) the function which returns the first element of a pair. We can modify the binary search tree insertion algorithm as follows to handle the AVL tree.

insert(T, k) = first(ins(T, k))   (4.8)

where

ins(T, k) =
    (node(∅, k, ∅, 0), 1)            : T = ∅
    tree(ins(L, k), Key, (R, 0), δ)  : k < Key
    tree((L, 0), Key, ins(R, k), δ)  : otherwise   (4.9)

L, R, Key, δ represent the left child, right child, the key and the balance factor of a tree:

L = left(T)
R = right(T)
Key = key(T)
δ = δ(T)
When we insert a new key k to an AVL tree T, if the tree is empty, we just need to create a leaf node with k, set the balance factor to 0, and the height is increased by one. This is the trivial case. Function node() is defined to build a tree from a left sub-tree, a right sub-tree, a key and a balance factor.

If T isn't empty, we need to compare Key with k. If k is less than the key, we recursively insert it to the left child, otherwise we insert it into the right child.

As we defined above, the result of the recursive insertion is a pair like (L', ΔHl); we need to do the balancing adjustment as well as update the increment of height. Function tree() is defined to deal with this task. It takes 4 parameters: (L', ΔHl), Key, (R', ΔHr), and δ. The result of this function is defined as (T', ΔH), where T' is the new tree after adjustment, and ΔH is the new increment of height, defined as

ΔH = |T'| − |T|   (4.10)
This can be deduced in detail in 4 cases:

ΔH = |T'| − |T|
   = 1 + max(|R'|, |L'|) − (1 + max(|R|, |L|))
   = max(|R'|, |L'|) − max(|R|, |L|)
   = ΔHr       : δ ≥ 0 ∧ δ' ≥ 0
     δ + ΔHr   : δ ≤ 0 ∧ δ' ≥ 0
     ΔHl − δ   : δ ≥ 0 ∧ δ' ≤ 0
     ΔHl       : otherwise   (4.11)
To prove this equation, note the fact that the height can't increase in both the left and the right side with only one insertion.

These 4 cases can be explained from the definition of the balance factor: it equals the difference between the heights of the right and left sub-trees.

• If δ ≥ 0 and δ' ≥ 0, it means that the height of the right sub-tree isn't less than the height of the left sub-tree, both before and after insertion. In this case, the increment in height of the tree is only `contributed' from the right sub-tree, which is ΔHr.

• If δ ≤ 0, which means the height of the left sub-tree isn't less than that of the right sub-tree before insertion, and it becomes δ' ≥ 0 afterwards, the height of the right sub-tree increases due to the insertion, while the left side keeps the same (|L'| = |L|). So the increment in height is

ΔH = max(|R'|, |L'|) − max(|R|, |L|)   {δ ≤ 0 ∧ δ' ≥ 0}
   = |R'| − |L'|   {|L| = |L'|}
   = |R| + ΔHr − |L|
   = δ + ΔHr

• For the case δ ≥ 0 ∧ δ' ≤ 0, similar to the second one, we can get

ΔH = max(|R'|, |L'|) − max(|R|, |L|)   {δ ≥ 0 ∧ δ' ≤ 0}
   = |L'| − |R|
   = |L| + ΔHl − |R|
   = ΔHl − δ

• For the last case, both δ and δ' are no bigger than zero, which means the height of the left sub-tree is always greater than or equal to that of the right sub-tree, so the increment in height is only `contributed' from the left sub-tree, which is ΔHl.
The next problem in front of us is how to determine the new balance factor value δ' before performing the balancing adjustment. According to the definition of the AVL tree, the balance factor is the height of the right sub-tree minus the height of the left sub-tree. We have the following fact:

δ' = |R'| − |L'|
   = |R| + ΔHr − (|L| + ΔHl)
   = |R| − |L| + ΔHr − ΔHl
   = δ + ΔHr − ΔHl   (4.12)
With all these changes in height and balance factor made clear, it's possible to define the tree() function mentioned in (4.9):

tree((L', ΔHl), Key, (R', ΔHr), δ) = balance(node(L', Key, R', δ'), ΔH)   (4.13)

Before we move into the details of the balancing adjustment, let's translate the above equations to a real program in Haskell.

First is the insert function.
insert :: (Ord a) => AVLTree a -> a -> AVLTree a
insert t x = fst $ ins t where
    ins Empty = (Br Empty x Empty 0, 1)
    ins (Br l k r d)
        | x < k     = tree (ins l) k (r, 0) d
        | x == k    = (Br l k r d, 0)
        | otherwise = tree (l, 0) k (ins r) d
Here we also handle the case of inserting a duplicated key (a key which already exists) by just overwriting.

tree :: (AVLTree a, Int) -> a -> (AVLTree a, Int) -> Int -> (AVLTree a, Int)
tree (l, dl) k (r, dr) d = balance (Br l k r d', delta) where
    d' = d + dr - dl
    delta = deltaH d d' dl dr
And the definition of the height increment is as below.

deltaH :: Int -> Int -> Int -> Int -> Int
deltaH d d' dl dr
    | d >= 0 && d' >= 0 = dr
    | d <= 0 && d' >= 0 = d + dr
    | d >= 0 && d' <= 0 = dl - d
    | otherwise         = dl
4.3.1 Balancing adjustment

As the pattern matching approach is adopted for re-balancing, we need to consider what kinds of patterns violate the AVL tree property.

Figure 4.3 shows the 4 cases which need fixing. For all these 4 cases the balance factors are either -2 or +2, which exceed the range [−1, 1]. After the balancing adjustment, this factor turns to 0, which means the height of the left sub-tree equals that of the right sub-tree.
Figure 4.3: 4 cases for balancing an AVL tree after insertion
We call these four cases left-left lean, right-right lean, right-left lean, and left-right lean, in clockwise direction from top-left. We denote the balance factors before fixing as δ(x), δ(y), and δ(z); after fixing, they change to δ'(x), δ'(y), and δ'(z) respectively.

We'll next prove that, after fixing, we have δ'(y) = 0 for all four cases, and we'll provide the resulting values of δ'(x) and δ'(z).
Left-left lean case

As the structure of the sub-tree x doesn't change due to fixing, we immediately get δ'(x) = δ(x).

Since δ(y) = −1 and δ(z) = −2, we have

δ(y) = |C| − |x| = −1 ⇒ |C| = |x| − 1
δ(z) = |D| − |y| = −2 ⇒ |D| = |y| − 2   (4.14)

After fixing:

δ'(z) = |D| − |C|   {from (4.14)}
      = |y| − 2 − (|x| − 1)
      = |y| − |x| − 1   {x is a child of y ⇒ |y| − |x| = 1}
      = 0   (4.15)

For δ'(y), we have the following fact after fixing:

δ'(y) = |z| − |x|
      = 1 + max(|C|, |D|) − |x|   {by (4.15), we have |C| = |D|}
      = 1 + |C| − |x|   {by (4.14)}
      = 1 + |x| − 1 − |x|
      = 0   (4.16)

Summarizing the above results, the left-left lean case adjusts the balance factors as follows:

δ'(x) = δ(x)
δ'(y) = 0
δ'(z) = 0   (4.17)
Right-right lean case

Since the right-right case is symmetric to the left-left case, we can easily obtain the resulting balance factors:

δ'(x) = 0
δ'(y) = 0
δ'(z) = δ(z)   (4.18)
Right-left lean case

First let's consider δ'(x). After the balance fixing, we have

δ'(x) = |B| − |A|   (4.19)

Before fixing, if we calculate the height of z, we can get

|z| = 1 + max(|y|, |D|)   {δ(z) = −1 ⇒ |y| > |D|}
    = 1 + |y|
    = 2 + max(|B|, |C|)   (4.20)

Since δ(x) = 2, we can deduce that

δ(x) = 2 ⇒ |z| − |A| = 2   {by (4.20)}
      ⇒ 2 + max(|B|, |C|) − |A| = 2
      ⇒ max(|B|, |C|) − |A| = 0   (4.21)

If δ(y) = 1, which means |C| − |B| = 1, then

max(|B|, |C|) = |C| = |B| + 1   (4.22)

Taking this into (4.21) yields

|B| + 1 − |A| = 0 ⇒ |B| − |A| = −1   {by (4.19)}
              ⇒ δ'(x) = −1   (4.23)

If δ(y) ≠ 1, it means max(|B|, |C|) = |B|; taking this into (4.21) yields

|B| − |A| = 0   {by (4.19)}
⇒ δ'(x) = 0   (4.24)
Summarizing these 2 cases, we get the relationship between δ'(x) and δ(y) as follows:

δ'(x) = −1 : δ(y) = 1
        0  : otherwise   (4.25)

For δ'(z), according to the definition, it is equal to

δ'(z) = |D| − |C|   {δ(z) = −1 = |D| − |y|}
      = |y| − |C| − 1   {|y| = 1 + max(|B|, |C|)}
      = max(|B|, |C|) − |C|   (4.26)

If δ(y) = −1, then we have |C| − |B| = −1, so max(|B|, |C|) = |B| = |C| + 1. Taking this into (4.26), we get δ'(z) = 1.

If δ(y) ≠ −1, then max(|B|, |C|) = |C|, and we get δ'(z) = 0.

Combining these two cases, the relationship between δ'(z) and δ(y) is as below:

δ'(z) = 1 : δ(y) = −1
        0 : otherwise   (4.27)
Finally, for δ'(y), we deduce it like below:

δ'(y) = |z| − |x|
      = max(|C|, |D|) − max(|A|, |B|)   (4.28)

There are three cases:

• If δ(y) = 0, it means |B| = |C|, and according to (4.25) and (4.27), we have δ'(x) = 0 ⇒ |A| = |B| and δ'(z) = 0 ⇒ |C| = |D|; these lead to δ'(y) = 0.

• If δ(y) = 1, from (4.27) we have δ'(z) = 0 ⇒ |C| = |D|.

δ'(y) = max(|C|, |D|) − max(|A|, |B|)   {|C| = |D|}
      = |C| − max(|A|, |B|)   {from (4.25): δ'(x) = −1 ⇒ |B| − |A| = −1}
      = |C| − (|B| + 1)   {δ(y) = 1 ⇒ |C| − |B| = 1}
      = 0

• If δ(y) = −1, from (4.25) we have δ'(x) = 0 ⇒ |A| = |B|.

δ'(y) = max(|C|, |D|) − max(|A|, |B|)   {|A| = |B|}
      = max(|C|, |D|) − |B|   {from (4.27): |D| − |C| = 1}
      = |C| + 1 − |B|   {δ(y) = −1 ⇒ |C| − |B| = −1}
      = 0

All three cases lead to the same result: δ'(y) = 0.
Collecting all the above results, we get the new balance factors after fixing as follows:

δ'(x) = −1 : δ(y) = 1
        0  : otherwise

δ'(y) = 0

δ'(z) = 1 : δ(y) = −1
        0 : otherwise   (4.29)
Left-right lean case

The left-right lean case is symmetric to the right-left lean case. By a similar deduction, we can find that the new balance factors are identical to the result in (4.29).
4.3.2 Pattern Matching

All the problems have been solved and it's time to define the final pattern matching fixing function.

balance(T, ΔH) =
    (node(node(A, x, B, δ(x)), y, node(C, z, D, 0), 0), 0)       : Pll(T)
    (node(node(A, x, B, 0), y, node(C, z, D, δ(z)), 0), 0)       : Prr(T)
    (node(node(A, x, B, δ'(x)), y, node(C, z, D, δ'(z)), 0), 0)  : Prl(T) ∨ Plr(T)
    (T, ΔH)                                                      : otherwise   (4.30)

where Pll(T) means the pattern of tree T is left-left lean, and so on. δ'(x) and δ'(z) are defined in (4.29). The four patterns are tested as below:

Pll(T) = node(node(node(A, x, B, δ(x)), y, C, −1), z, D, −2)
Prr(T) = node(A, x, node(B, y, node(C, z, D, δ(z)), 1), 2)
Prl(T) = node(node(A, x, node(B, y, C, δ(y)), 1), z, D, −2)
Plr(T) = node(A, x, node(node(B, y, C, δ(y)), z, D, −1), 2)   (4.31)
Translating the above function definition to Haskell yields a simple and intuitive program.
balance :: (AVLTree a, Int) -> (AVLTree a, Int)
balance (Br (Br (Br a x b dx) y c (-1)) z d (-2), _) =
    (Br (Br a x b dx) y (Br c z d 0) 0, 0)
balance (Br a x (Br b y (Br c z d dz) 1) 2, _) =
    (Br (Br a x b 0) y (Br c z d dz) 0, 0)
balance (Br (Br a x (Br b y c dy) 1) z d (-2), _) =
    (Br (Br a x b dx') y (Br c z d dz') 0, 0) where
        dx' = if dy ==  1 then -1 else 0
        dz' = if dy == -1 then  1 else 0
balance (Br a x (Br (Br b y c dy) z d (-1)) 2, _) =
    (Br (Br a x b dx') y (Br c z d dz') 0, 0) where
        dx' = if dy ==  1 then -1 else 0
        dz' = if dy == -1 then  1 else 0
balance (t, d) = (t, d)
The insertion algorithm takes time proportional to the height of the tree, and according to the result we proved above, its performance is O(lg N), where N is the number of elements stored in the AVL tree.
Verification

One can easily create a function to verify that a tree is an AVL tree. Actually we need to verify two things: first, that it's a binary search tree; second, that it satisfies the AVL tree property.

We leave the first verification problem as an exercise to the reader.

In order to test if a binary tree satisfies the AVL tree property, we can test the difference in height between its two children, and recursively test that both children conform to the AVL property, until we arrive at an empty leaf.
avl?(T) = True                                   : T = ∅
          avl?(L) ∧ avl?(R) ∧ ||R| − |L|| ≤ 1    : otherwise   (4.32)

And the height of an AVL tree can also be calculated from the definition:

|T| = 0                     : T = ∅
      1 + max(|R|, |L|)     : otherwise   (4.33)
The corresponding Haskell program is given as the following.

isAVL :: (AVLTree a) -> Bool
isAVL Empty = True
isAVL (Br l _ r d) = and [isAVL l, isAVL r, abs (height r - height l) <= 1]

height :: (AVLTree a) -> Int
height Empty = 0
height (Br l _ r _) = 1 + max (height l) (height r)
Exercise 4.1

Write a program to verify that a binary tree is a binary search tree in your favorite programming language. If you choose to use an imperative language, please consider realizing this program without recursion.
4.4 Deletion

As we mentioned before, deletion doesn't make much sense in purely functional settings. As the tree is read-only, it typically serves frequent lookups after being built.

Even if we implement deletion, it actually re-builds the tree, as we presented in the chapter about red-black trees. We leave the deletion of the AVL tree as an exercise to the reader.
Exercise 4.2

• Taking the red-black tree deletion algorithm as an example, write the AVL tree deletion program in a purely functional approach in your favorite programming language.

• Write the deletion algorithm in an imperative approach in your favorite programming language.
4.5 Imperative AVL tree algorithm ⋆

We have almost finished all the content in this chapter about the AVL tree. However, it is necessary to show the traditional insert-and-rotate approach as a comparison to the pattern matching algorithm.

Similar to the imperative red-black tree algorithm, the strategy is first to do the insertion as for the binary search tree, then fix the balance problem by rotation, and return the final result.
1: function Insert(T, k)
2: root ← T
3: x ← Create-Leaf(k)
4: δ(x) ← 0
5: parent ← NIL
6: while T ≠ NIL do
7: parent ← T
8: if k < Key(T) then
9: T ← Left(T)
10: else
11: T ← Right(T)
12: Parent(x) ← parent
13: if parent = NIL then ▷ tree T is empty
14: return x
15: else if k < Key(parent) then
16: Left(parent) ← x
17: else
18: Right(parent) ← x
19: return AVL-Insert-Fix(root, x)
Note that after insertion, the height of the tree may increase, so that the balance factor δ may also change. Inserting on the right side will increase δ by 1, while inserting on the left side will decrease it. By the end of this algorithm, we need to perform bottom-up fixing from node x towards the root.

We can translate the pseudo code to a real programming language, such as Python².

²C and C++ source code are available along with this book.
def avl_insert(t, key):
    root = t
    x = Node(key)
    parent = None
    while(t):
        parent = t
        if(key < t.key):
            t = t.left
        else:
            t = t.right
    if parent is None: # tree is empty
        root = x
    elif key < parent.key:
        parent.set_left(x)
    else:
        parent.set_right(x)
    return avl_insert_fix(root, x)
This is a top-down algorithm. It searches the tree from the root down to the proper position and inserts the new key as a leaf. At the end, it calls the fixing procedure, passing the root and the newly inserted node.

Note that we reuse the same methods set_left() and set_right() as defined in the chapter about red-black trees.
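A short usage sketch (assuming the Node class with key, delta and parent fields described earlier in this chapter): the tree is built by repeated insertion.

# Build an AVL tree from a list of keys (a usage sketch).
t = None
for k in [4, 2, 8, 1, 3, 6, 9, 5, 7, 10]:
    t = avl_insert(t, k)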
In order to restore the AVL tree balance property by fixing, we first determine if the new node is inserted on the left hand or the right hand. If it is on the left, the balance factor δ decreases; otherwise it increases. If we denote the new value as δ', there are 3 cases for the relationship between δ and δ'.

• If |δ| = 1 and |δ'| = 0, this means adding the new node makes the tree perfectly balanced; the height of the parent node doesn't change, and the algorithm can terminate.

• If |δ| = 0 and |δ'| = 1, it means that either the height of the left sub-tree or that of the right sub-tree increases; we need to go on checking the upper level of the tree.

• If |δ| = 1 and |δ'| = 2, it means the AVL tree property is violated due to the new insertion. We need to perform rotations to fix it.
1: function AVL-Insert-Fix(T, x)
2: while Parent(x) ≠ NIL do
3: δ ← δ(Parent(x))
4: if x = Left(Parent(x)) then
5: δ' ← δ − 1
6: else
7: δ' ← δ + 1
8: δ(Parent(x)) ← δ'
9: P ← Parent(x)
10: L ← Left(P)
11: R ← Right(P)
12: if |δ| = 1 and |δ'| = 0 then ▷ Height doesn't change, terminate.
13: return T
14: else if |δ| = 0 and |δ'| = 1 then ▷ Go on bottom-up updating.
15: x ← P
16: else if |δ| = 1 and |δ'| = 2 then
17: if δ' = 2 then
18: if δ(R) = 1 then ▷ Right-right case
19: δ(P) ← 0 ▷ By (4.18)
20: δ(R) ← 0
21: T ← Left-Rotate(T, P)
22: if δ(R) = −1 then ▷ Right-left case
23: δy ← δ(Left(R)) ▷ By (4.29)
24: if δy = 1 then
25: δ(P) ← −1
26: else
27: δ(P) ← 0
28: δ(Left(R)) ← 0
29: if δy = −1 then
30: δ(R) ← 1
31: else
32: δ(R) ← 0
33: T ← Right-Rotate(T, R)
34: T ← Left-Rotate(T, P)
35: if δ' = −2 then
36: if δ(L) = −1 then ▷ Left-left case
37: δ(P) ← 0
38: δ(L) ← 0
39: T ← Right-Rotate(T, P)
40: else ▷ Left-right case
41: δy ← δ(Right(L))
42: if δy = 1 then
43: δ(L) ← −1
44: else
45: δ(L) ← 0
46: δ(Right(L)) ← 0
47: if δy = −1 then
48: δ(P) ← 1
49: else
50: δ(P) ← 0
51: T ← Left-Rotate(T, L)
52: T ← Right-Rotate(T, P)
53: break
54: return T
Here we reuse the rotation algorithms mentioned in the red-black tree chapter. The rotation operation doesn't update the balance factor δ at all. However, since rotation changes (actually improves) the balance situation, we should update these factors. Here we refer to the results from the above section. Among the four cases, the right-right case and the left-left case only need one rotation, while the right-left case and the left-right case need two rotations.

The corresponding Python program is shown as follows.
def avl_insert_fix(t, x):
    while x.parent is not None:
        d2 = d1 = x.parent.delta
        if x == x.parent.left:
            d2 = d2 - 1
        else:
            d2 = d2 + 1
        x.parent.delta = d2
        (p, l, r) = (x.parent, x.parent.left, x.parent.right)
        if abs(d1) == 1 and abs(d2) == 0:
            return t
        elif abs(d1) == 0 and abs(d2) == 1:
            x = x.parent
        elif abs(d1) == 1 and abs(d2) == 2:
            if d2 == 2:
                if r.delta == 1: # Right-right case
                    p.delta = 0
                    r.delta = 0
                    t = left_rotate(t, p)
                if r.delta == -1: # Right-left case
                    dy = r.left.delta
                    if dy == 1:
                        p.delta = -1
                    else:
                        p.delta = 0
                    r.left.delta = 0
                    if dy == -1:
                        r.delta = 1
                    else:
                        r.delta = 0
                    t = right_rotate(t, r)
                    t = left_rotate(t, p)
            if d2 == -2:
                if l.delta == -1: # Left-left case
                    p.delta = 0
                    l.delta = 0
                    t = right_rotate(t, p)
                if l.delta == 1: # Left-right case
                    dy = l.right.delta
                    if dy == 1:
                        l.delta = -1
                    else:
                        l.delta = 0
                    l.right.delta = 0
                    if dy == -1:
                        p.delta = 1
                    else:
                        p.delta = 0
                    t = left_rotate(t, l)
                    t = right_rotate(t, p)
            break
    return t
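As a quick consistency check (a sketch, not part of the book's code), the delta stored in every node can be compared against the actual height difference after a series of insertions:

def height(t):
    # Height of a (possibly empty) tree.
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

def check_delta(t):
    # Every stored delta must equal the real height difference,
    # and stay within [-1, 1].
    if t is None:
        return True
    return (t.delta == height(t.right) - height(t.left)
            and abs(t.delta) <= 1
            and check_delta(t.left) and check_delta(t.right))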
We skip the AVL tree deletion algorithm and leave it as an exercise to the reader.
4.6 Chapter note

The AVL tree was invented in 1962 by Adelson-Velskii and Landis[3], [4]. The name AVL tree comes from the two inventors' names. It's earlier than the red-black tree.

It's very common to compare the AVL tree and the red-black tree: both are self-balancing binary search trees, and for all the major operations they both take O(lg N) time. From the result of (4.7), the AVL tree is more rigidly balanced, hence it is faster than the red-black tree in lookup-intensive applications [3]. However, red-black trees could perform better in cases of frequent insertion and removal.

Many popular self-balancing binary search tree libraries are implemented on top of the red-black tree, such as STL, etc. However, the AVL tree provides an intuitive and effective solution to the balance problem as well.

After this chapter, we'll extend the tree data structure from storing data in nodes to storing information on edges, which leads to Trie and Patricia, etc. If we extend the number of children from two to more, we can get the B-tree. These data structures will be introduced next.
Bibliography

[1] Data.Tree.AVL http://hackage.haskell.org/packages/archive/AvlTree/4.2/doc/html/Data-Tree-AVL.html

[2] Chris Okasaki. FUNCTIONAL PEARLS Red-Black Trees in a Functional Setting. J. Functional Programming. 1998

[3] Wikipedia. AVL tree. http://en.wikipedia.org/wiki/AVL_tree

[4] Guy Cousineau, Michel Mauny. The Functional Approach to Programming. Cambridge University Press; English Ed edition (October 29, 1998). ISBN-13: 978-0521576819

[5] Pavel Grafov. Implementation of an AVL tree in Python. http://github.com/pgrafov/python-avl-tree
Chapter 5

Trie and Patricia

5.1 Introduction

The binary trees introduced so far store information in nodes. It's possible to store the information on edges instead. Trie and Patricia are important data structures for information retrieval and manipulation. They were invented in the 1960s, and are widely used in compiler design[2] and in bioinformatics, such as DNA pattern matching [3].
Figure 5.1: Radix tree.
Figure 5.1 shows a radix tree[2]. It contains the bit strings 1011, 10, 011, 100 and 0. When searching for a key k = (b₀b₁...bₙ)₂, we take the first bit b₀ (the MSB, from the left) and check if it is 0 or 1: if it is 0, we turn left, and we turn right for 1. Then we take the second bit and repeat this search till we either meet a leaf or finish all n bits.
The radix tree needn't store keys in nodes at all. The information is represented by the edges. The nodes marked with keys in the above figure are only for illustration purposes.

It is very natural to come to the idea: `is it possible to represent the key as an integer instead of a string?' Because an integer can be written in binary format, such an approach can save space. Another advantage is speed: we can use bit-wise manipulation in most programming environments.
5.2 Integer Trie

The data structure shown in figure 5.1 is called a binary trie. The trie was invented by Edward Fredkin. The name comes from `retrieval', pronounced /'tri:/ by the inventor, while it is pronounced /'trai/ `try' by other authors [5]. Trie is also called prefix tree or radix tree. A binary trie is a special binary tree in which the placement of each key is controlled by its bits: each 0 means `go left' and each 1 means `go right'[2].

Because integers can be represented in binary format, it is possible to store integer keys rather than 0-1 strings. When inserting an integer as a new key to the trie, we change it to binary form, then examine the first bit: if it is 0, we recursively insert the rest of the bits to the left sub-tree; otherwise, if it is 1, we insert them into the right sub-tree.
There is a problem when we treat the key as an integer. Consider the binary trie shown in figure 5.2. If represented as 0-1 strings, all the three keys are different. But they are identical when turned into integers. Where should we insert decimal 3, for example, to the trie?

One approach is to treat all the prefix zeros as effective bits. Suppose the integer is represented with 32 bits; if we want to insert key 1, it ends up with a tree of 32 levels. There are 31 nodes, each only having a left sub-tree; the last node only has a right sub-tree. It is very inefficient in terms of space.
Okasaki shows a method to solve this problem in [2]. Instead of using big-endian integers, we can use little-endian integers to represent keys. Thus decimal integer 1 is represented as binary 1. Inserting it to the empty binary trie, the result is a trie with a root and a right leaf; there is only 1 level. Decimal 2 is represented as 01, and decimal 3 is (11)₂ in little-endian binary format. There is no need to add any prefix 0; the position in the trie is uniquely determined.
5.2.1 Definition of integer Trie

In order to define the little-endian binary trie, we can reuse the structure of the binary tree. A binary trie node is either empty or a branch node. The branch node contains a left child, a right child, and an optional value as satellite data. The left sub-tree is encoded as 0 and the right sub-tree is encoded as 1.

The following example Haskell code defines the trie algebraic data type.

data IntTrie a = Empty
               | Branch (IntTrie a) (Maybe a) (IntTrie a)
Figure 5.2: A big-endian trie.
Below is another example definition in Python.

class IntTrie:
    def __init__(self):
        self.left = self.right = None
        self.value = None
5.2.2 Insertion

Since the key is a little-endian integer, when inserting a key, we take its bits one by one from the right-most (LSB). If a bit is 0, we go to the left; otherwise we go to the right for 1. If the child is empty, we need to create a new node. We repeat this to the last bit (MSB) of the key.

1: function Insert(T, k, v)
2: if T = NIL then
3: T ← Empty-Node
4: p ← T
5: while k ≠ 0 do
6: if Even?(k) then
7: if Left(p) = NIL then
8: Left(p) ← Empty-Node
9: p ← Left(p)
10: else
11: if Right(p) = NIL then
12: Right(p) ← Empty-Node
13: p ← Right(p)
14: k ← ⌊k/2⌋
15: Data(p) ← v
16: return T
This algorithm takes 3 arguments: a trie T, a key k, and the satellite data v. The following Python example code implements the insertion algorithm. The satellite data is optional; it is empty by default.

def trie_insert(t, key, value = None):
    if t is None:
        t = IntTrie()
    p = t
    while key != 0:
        if key & 1 == 0:
            if p.left is None:
                p.left = IntTrie()
            p = p.left
        else:
            if p.right is None:
                p.right = IntTrie()
            p = p.right
        key = key >> 1
    p.value = value
    return t
Figure 5.3 shows a trie which is created by inserting the pairs of key and value {1 → a, 4 → b, 5 → c, 9 → d} into the empty trie.
Figure 5.3: A little-endian integer binary trie for the map {1 → a, 4 → b, 5 → c, 9 → d}.
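The same map can be built with a short usage sketch:

# Build the trie of figure 5.3 by repeated insertion.
t = None
for k, v in [(1, 'a'), (4, 'b'), (5, 'c'), (9, 'd')]:
    t = trie_insert(t, k, v)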
Because the definition of the integer trie is recursive, it's natural to define the insertion algorithm recursively. If the LSB is 0, it means that the key to be inserted is even; we recursively insert it to the left child, and we can divide the key by 2 to get rid of the LSB. If the LSB is 1, the key is an odd number; the recursive insertion is applied to the right child. For trie T, denote the left and right children as Tl and Tr respectively. Thus T = (Tl, d, Tr), where d is the optional satellite data. If T is empty, Tl, Tr and d are defined as empty as well.

insert(T, k, v) = (Tl, v, Tr)                     : k = 0
                  (insert(Tl, k/2, v), d, Tr)     : even(k)
                  (Tl, d, insert(Tr, ⌊k/2⌋, v))   : otherwise   (5.1)

If the key to be inserted already exists, this algorithm just overwrites the previously stored data. This behavior can be replaced with other alternatives, such as storing the data in a linked list, etc.
The following Haskell example program implements the insertion algorithm.

insert t 0 x = Branch (left t) (Just x) (right t)
insert t k x | even k    = Branch (insert (left t) (k `div` 2) x) (value t) (right t)
             | otherwise = Branch (left t) (value t) (insert (right t) (k `div` 2) x)

left (Branch l _ _) = l
left Empty = Empty

right (Branch _ _ r) = r
right Empty = Empty

value (Branch _ v _) = v
value Empty = Nothing
For a given integer k with m bits in binary, the insertion algorithm goes through m levels. The performance is bound to O(m) time.
5.2.3 Look up

To look up a key k in the little-endian integer binary trie, we take each bit of k starting from the LSB, then go left if this bit is 0; otherwise, we go right. The looking up completes when all bits are consumed.

1: function Lookup(T, k)
2: while k ≠ 0 ∧ T ≠ NIL do
3: if Even?(k) then
4: T ← Left(T)
5: else
6: T ← Right(T)
7: k ← ⌊k/2⌋
8: if T ≠ NIL then
9: return Data(T)
10: else
11: return not found
The below Python example code uses bit-wise operations to implement the looking up algorithm.

def lookup(t, key):
    while key != 0 and (t is not None):
        if key & 1 == 0:
            t = t.left
        else:
            t = t.right
        key = key >> 1
    if t is not None:
        return t.value
    else:
        return None
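Continuing the usage sketch above, looking up an inserted key returns its value, while a missing key yields None:

assert lookup(t, 4) == 'b'    # 4 was mapped to 'b'
assert lookup(t, 2) is None   # 2 was never inserted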
Looking up can also be defined in a recursive manner. If the tree is empty, the looking up fails; if k = 0, the satellite data is the result to be found; if the last bit is 0, we recursively look up the left child; otherwise, we look up the right child.

lookup(T, k) = ∅                   : T = ∅
               d                   : k = 0
               lookup(Tl, k/2)     : even(k)
               lookup(Tr, ⌊k/2⌋)   : otherwise   (5.2)
The following Haskell example program implements the recursive look up 
algorithm. 
search Empty k = Nothing 
search t 0 = value t 
search t k = if even k then search (left t) (k `div` 2) 
else search (right t) (k `div` 2) 
The looking up algorithm is bound to O(m) time, where m is the number of 
bits for a given key. 
5.3 Integer Patricia

The trie has some drawbacks: it wastes a lot of space. Note that in figure 5.2, only the leaves store the real data. Typically, the integer binary trie contains many nodes that only have one child. One improvement idea is to compress the chained nodes together. Patricia is such a data structure, invented by Donald R. Morrison in 1968. Patricia means `practical algorithm to retrieve information coded in alphanumeric'[3]. It is another kind of prefix tree.

Okasaki gives an implementation of integer Patricia in [2]. If we merge the chained nodes which have only one child together in figure 5.3, we get the Patricia shown in figure 5.4.

From this figure, we can find that the key of sibling nodes is their longest common prefix. They branch out at a certain bit. Patricia saves a lot of space compared to the trie.

Different from the integer trie, using big-endian integers in Patricia doesn't cause the padding-zero problem mentioned in section 5.2. All zero bits before the MSB are omitted to save space. Okasaki lists some significant advantages of big-endian Patricia[2].
5.3.1 Definition

An integer Patricia tree is a special kind of binary tree. It is either empty or a node. There are two different types of node:
Figure 5.4: Little-endian Patricia for the map {1 → a, 4 → b, 5 → c, 9 → d}.
• It can be a leaf, containing an integer key and optional satellite data;

• or a branch node, containing the left and the right children. The two children share the longest common prefix bits of their keys. For the left child, the next bit of the key is zero, while the next bit is one for the right child.
The following Haskell example code defines Patricia accordingly.

type Key = Int
type Prefix = Int
type Mask = Int

data IntTree a = Empty
               | Leaf Key a
               | Branch Prefix Mask (IntTree a) (IntTree a)
In order to tell from which bit the left and right children differ, a mask is recorded in the branch node. A mask is a power of 2, i.e. 2ⁿ for some non-negative integer n; all bits lower than bit n don't belong to the common prefix.
The following Python example code defines Patricia as well as some auxiliary functions.

class IntTree:
    def __init__(self, key = None, value = None):
        self.key = key
        self.value = value
        self.prefix = self.mask = None
        self.left = self.right = None

    def set_children(self, l, r):
        self.left = l
        self.right = r

    def replace_child(self, x, y):
        if self.left == x:
            self.left = y
        else:
            self.right = y

    def is_leaf(self):
        return self.left is None and self.right is None

    def get_prefix(self):
        if self.prefix is None:
            return self.key
        else:
            return self.prefix
5.3.2 Insertion

When inserting a key, if the tree is empty, we can just create a leaf node with the given key and satellite data, as shown in figure 5.5.

Figure 5.5: Left: the empty tree; Right: after inserting key 12.

If the tree is just a singleton leaf node x, we can create a new leaf y and put the key and data into it. After that, we need to create a new branch node, and set x and y as the two children. In order to determine if y should be the left or the right child, we need to find the longest common prefix of x and y. For example, if key(x) is 12 ((1100)₂ in binary) and key(y) is 15 ((1111)₂ in binary), then the longest common prefix is (11oo)₂, where o denotes the bits we don't care about. We can use another integer to mask those bits. In this case, the mask number is 4 (100 in binary). The next bit after the longest common prefix represents 2¹. This bit is 0 in key(x), while it is 1 in key(y). We should set x as the left child and y as the right child. Figure 5.6 shows this example.
In case the tree is neither empty nor a singleton leaf, we need to first check if the key to be inserted matches the longest common prefix recorded in the root, then recursively insert the key to the left or the right child according to the next bit after the common prefix. For example, if we insert key 14 ((1110)₂ in binary) to the result tree in figure 5.6, since the common prefix is (11oo)₂ and the next bit (the bit of 2¹) is 1, we need to recursively insert it to the right child.

If the key to be inserted doesn't match the longest common prefix stored in the root, we need to branch a new leaf out. Figure 5.7 shows these two different cases.

For a given key k and value v, denote by (k, v) the leaf node. For a branch node, denote it in the form (p, m, Tl, Tr), where p is the longest common prefix, m is the mask, and Tl and Tr are the left and right children. Summarizing the above
Figure 5.6: Left: a tree with the singleton leaf 12; Right: after inserting key 15.

Figure 5.7: Insert a key to a branch node. (a) Insert key 14: it matches the longest common prefix (1100)₂; 14 is then recursively inserted to the right sub-tree. (b) Insert key 5: it doesn't match the longest common prefix (1100)₂; a new leaf is branched out.
cases, the insertion algorithm can be defined as follows:

insert(T, k, v) =
    (k, v)                        : T = ∅ ∨ T = (k, v')
    join(k, (k, v), k', T)        : T = (k', v')
    (p, m, insert(Tl, k, v), Tr)  : T = (p, m, Tl, Tr), match(k, p, m), zero(k, m)
    (p, m, Tl, insert(Tr, k, v))  : T = (p, m, Tl, Tr), match(k, p, m), ¬zero(k, m)
    join(k, (k, v), p, T)         : T = (p, m, Tl, Tr), ¬match(k, p, m)   (5.3)
The first clause deals with the edge cases: either T is empty, or it is a leaf node with the same key. The algorithm overwrites the previous value in the latter case.

The second clause handles the case that T is a leaf node, but with a different key. Here we need to branch out another new leaf. We need to extract the longest common prefix, and determine which leaf should be set as the left child and which as the right child. Function join(k1, T1, k2, T2) does this work. We'll define it later.

The third clause deals with the case that T is a branch node, the longest common prefix matches the key to be inserted, and the next bit after the common prefix is zero. Here we need to recursively insert to the left child.

The fourth clause handles the similar case as the third clause, except that the next bit after the common prefix is one, not zero. We need to recursively insert to the right child.

The last clause is for the case that the key to be inserted doesn't match the longest common prefix stored in the branch. We need to branch out a new leaf by calling the join function.
We need to define the function match(k, p, m) to test if the key k has the same prefix p above the masked bits m. For example, suppose the prefix stored in a branch node is (pn pn−1 ... pi ... p0)₂ in binary and key k is (kn kn−1 ... ki ... k0)₂ in binary, and the mask is (100...0)₂ = 2^i. They match if and only if pj = kj for all i ≤ j ≤ n.

One solution to realize match is to test if mask(k, m) = p is satisfied, where mask(x, m) = ¬(m − 1) ∧ x: we perform bitwise-not on m − 1, then perform bitwise-and with x.
Function zero(k, m) tests if the next bit after the common prefix is zero. With the help of the mask m, we can shift m one bit to the right, then perform bitwise-and with the key:

zero(k, m) = (k ∧ shiftr(m, 1) = 0)   (5.4)

If the mask m = (100...0)₂ = 2^i and k = (kn kn−1 ... ki 1 ... k0)₂, because the bit next to ki is 1, zero(k, m) returns false; if k = (kn kn−1 ... ki 0 ... k0)₂, then the result is true.
Function join(p1, T1, p2, T2) takes two different prefixes and trees. It extracts the longest common prefix of p1 and p2, creates a new branch node, and sets T1 and T2 as the two children:

join(p1, T1, p2, T2) = (p, m, T1, T2) : zero(p1, m), where (p, m) = LCP(p1, p2)
                       (p, m, T2, T1) : ¬zero(p1, m)   (5.5)
In order to calculate the longest common prefix of p1 and p2, we can first compute their bitwise exclusive-or, then count the number of bits in the result, and generate a mask m = 2^|xor(p1, p2)|. The longest common prefix p can be given by masking the bits with m for either p1 or p2:

p = mask(p1, m)   (5.6)
The following Haskell example code implements the insertion algorithm.

import Data.Bits

insert t k x
   = case t of
       Empty -> Leaf k x
       Leaf k' x' -> if k == k' then Leaf k x
                     else join k (Leaf k x) k' t -- t@(Leaf k' x')
       Branch p m l r
          | match k p m -> if zero k m
                           then Branch p m (insert l k x) r
                           else Branch p m l (insert r k x)
          | otherwise -> join k (Leaf k x) p t -- t@(Branch p m l r)

join p1 t1 p2 t2 = if zero p1 m then Branch p m t1 t2
                   else Branch p m t2 t1
    where (p, m) = lcp p1 p2

lcp :: Prefix -> Prefix -> (Prefix, Mask)
lcp p1 p2 = (p, m) where
    m = bit (highestBit (p1 `xor` p2))
    p = mask p1 m

highestBit x = if x == 0 then 0 else 1 + highestBit (shiftR x 1)

mask x m = x .&. complement (m - 1) -- complement means bitwise-not

zero x m = x .&. (shiftR m 1) == 0

match k p m = (mask k m) == p
The insertion algorithm can also be realized imperatively.

1: function Insert(T, k, v)
2: if T = NIL then
3: T ← Create-Leaf(k, v)
4: return T
5: y ← T
6: p ← NIL
7: while y is not a leaf, and Match(k, Prefix(y), Mask(y)) do
8: p ← y
9: if Zero?(k, Mask(y)) then
10: y ← Left(y)
11: else
12: y ← Right(y)
13: if y is a leaf, and k = Key(y) then
14: Data(y) ← v
15: else
16: z ← Branch(y, Create-Leaf(k, v))
17: if p = NIL then
18: T ← z
19: else
20: if Left(p) = y then
21: Left(p) ← z
22: else
23: Right(p) ← z
24: return T
Function Branch(T1, T2) does a similar job to join. It creates a new branch node, extracts the longest common prefix, and sets T1 and T2 as the two children.

1: function Branch(T1, T2)
2: T ← Empty-Node
3: (Prefix(T), Mask(T)) ← LCP(Prefix(T1), Prefix(T2))
4: if Zero?(Prefix(T1), Mask(T)) then
5: Left(T) ← T1
6: Right(T) ← T2
7: else
8: Left(T) ← T2
9: Right(T) ← T1
10: return T
The following Python example program implements the insertion algorithm.

def insert(t, key, value = None):
    if t is None:
        t = IntTree(key, value)
        return t
    node = t
    parent = None
    while(True):
        if match(key, node):
            parent = node
            if zero(key, node.mask):
                node = node.left
            else:
                node = node.right
        else:
            if node.is_leaf() and key == node.key:
                node.value = value
            else:
                new_node = branch(node, IntTree(key, value))
                if parent is None:
                    t = new_node
                else:
                    parent.replace_child(node, new_node)
            break
    return t
The auxiliary functions match, branch, lcp, etc. are given below.

def maskbit(x, mask):
    return x & (~(mask - 1))

def match(key, tree):
    return (not tree.is_leaf()) and maskbit(key, tree.mask) == tree.prefix

def zero(x, mask):
    return x & (mask >> 1) == 0

def lcp(p1, p2):
    diff = (p1 ^ p2)
    mask = 1
    while(diff != 0):
        diff >>= 1
        mask <<= 1
    return (maskbit(p1, mask), mask)

def branch(t1, t2):
    t = IntTree()
    (t.prefix, t.mask) = lcp(t1.get_prefix(), t2.get_prefix())
    if zero(t1.get_prefix(), t.mask):
        t.set_children(t1, t2)
    else:
        t.set_children(t2, t1)
    return t
Figure 5.8 shows the example Patricia created with the insertion algorithm.

Figure 5.8: Insert the map 1 → x, 4 → y, 5 → z into the big-endian integer Patricia tree.
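The tree in figure 5.8 can be reproduced with a short usage sketch:

# Build the Patricia of figure 5.8 by repeated insertion.
t = None
for k, v in [(1, 'x'), (4, 'y'), (5, 'z')]:
    t = insert(t, k, v)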
5.3.3 Look up

Consider the property of the integer Patricia tree: when looking up a key, if it has a common prefix with the root, we check the next bit. If this bit is zero, we recursively look up the left child; otherwise, if the bit is one, we look up the right child.

When we reach a leaf node, we can directly check if the key of the leaf is equal to what we are looking up. This algorithm can be described with the following pseudo code.

1: function Look-Up(T, k)
2: if T = NIL then
3: return NIL ▷ Not found
4: while T is not a leaf, and Match(k, Prefix(T), Mask(T)) do
5: if Zero?(k, Mask(T)) then
6: T ← Left(T)
7: else
8: T ← Right(T)
9: if T is a leaf, and Key(T) = k then
10: return Data(T)
11: else
12: return NIL ▷ Not found
The below Python example program implements the looking up algorithm.

def lookup(t, key):
    if t is None:
        return None
    while (not t.is_leaf()) and match(key, t):
        if zero(key, t.mask):
            t = t.left
        else:
            t = t.right
    if t.is_leaf() and t.key == key:
        return t.value
    else:
        return None
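Continuing the usage sketch above:

assert lookup(t, 4) == 'y'    # an existing key
assert lookup(t, 3) is None   # 3 was never inserted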
The looking up algorithm can also be realized in a recursive approach. If the Patricia tree T is empty, or it's a singleton leaf with a different key from what we are looking up, the result is empty to indicate a not-found error. If the tree is a singleton leaf, and the key of this leaf is equal to what we are looking up, we are done. Otherwise, T is a branch node; we need to check if the common prefix matches the key to be looked up, and recursively look up the child according to the next bit. If the common prefix doesn't match the key, it means the key doesn't exist in the tree, and we can return an empty result to indicate the not-found error.

lookup(T, k) =
    ∅              : T = ∅ ∨ (T = (k', v), k' ≠ k)
    v              : T = (k', v), k' = k
    lookup(Tl, k)  : T = (p, m, Tl, Tr), match(k, p, m), zero(k, m)
    lookup(Tr, k)  : T = (p, m, Tl, Tr), match(k, p, m), ¬zero(k, m)
    ∅              : otherwise   (5.7)
The following Haskell example program implements this recursive looking up algorithm.

search t k
   = case t of
       Empty -> Nothing
       Leaf k' x -> if k == k' then Just x else Nothing
       Branch p m l r
          | match k p m -> if zero k m then search l k
                           else search r k
          | otherwise -> Nothing
5.4 Alphabetic Trie

Integer-based tries and Patricia trees can be a good starting point. Such techniques play an important role in compiler implementation. Okasaki pointed out that the widely used Glasgow Haskell Compiler, GHC, utilized a similar implementation for several years before 1998 [2].

If we extend the key from integers to alphabetic values, the trie and the Patricia tree can be very powerful in solving textual manipulation problems.
5.4.1 Definition

It's not enough to just use the left and right children to represent alphabetic keys. Taking English for example, there are 26 letters, and each can be lower or upper case. If we don't care about the case, one solution is to limit the number of branches (children) to 26. Some simplified ANSI C implementations of the trie are defined by using an array of 26 letters. This is illustrated in figure 5.9.

Not all branch nodes contain data. For instance, in figure 5.9, the root only has three non-empty branches, representing the letters 'a', 'b', and 'z'. Other branches, such as the one for letter 'c', are all empty. We don't show empty branches in the rest of this chapter.
If we deal with case-sensitive problems, or handle languages other than English, there can be more than 26 letters. The problem of the dynamic number of sub-branches can be solved by using collection data structures, such as a hash table or a map, as sketched below.
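A minimal sketch of such a node in Python (a hypothetical illustration; the class name AlphaTrie is an assumption, and a dictionary stands in for the hash table):

class AlphaTrie:
    def __init__(self, value=None):
        self.value = value
        self.children = {}  # maps a character to its sub-trie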
An alphabetic trie is either empty or a node. There are two types of node:

• A leaf node doesn't have any sub-trees;

• A branch node contains multiple sub-trees. Each sub-tree is bound to a character.

Both leaf and branch nodes may contain optional satellite data. The following Haskell code shows the example definition.

data Trie a = Trie { value :: Maybe a
                   , children :: [(Char, Trie a)]}

empty = Trie Nothing []
Figure 5.9: A trie with 26 branches, containing the keys 'a', 'an', 'another', 'bool', 'boy' and 'zoo'.
The below ANSI C code defines the alphabetic trie. For illustration purposes only, we limit the character set to the lower case English letters, from 'a' to 'z'.

struct Trie {
    struct Trie* children[26];
    void* data;
};
5.4.2 Insertion

When inserting a string as a key, starting from the root, we pick the characters one by one from the string and examine which child represents each character. If the corresponding child is empty, a new empty node is created. After that, the next character is used to select the proper grandchild.

We repeat this process for all the characters, and finally store the optional satellite data in the node we arrive at.

The below pseudo code describes the insertion algorithm.

1: function Insert(T, k, v)
2: if T = NIL then
3: T ← Empty-Node
4: p ← T
5: for each c in k do
6: if Children(p)[c] = NIL then
7: Children(p)[c] ← Empty-Node
8: p ← Children(p)[c]
9: Data(p) ← v
10: return T
The following example ANSI C program implements the insertion algorithm.

struct Trie* insert(struct Trie* t, const char* key, void* value) {
    int c;
    struct Trie* p;
    if (!t)
        t = create_node();
    for (p = t; *key; ++key, p = p->children[c]) {
        c = *key - 'a';
        if (!p->children[c])
            p->children[c] = create_node();
    }
    p->data = value;
    return t;
}
Where the function create_node creates a new empty node, with all children initialized to empty.

struct Trie* create_node() {
    struct Trie* t = (struct Trie*) malloc(sizeof(struct Trie));
    int i;
    for (i = 0; i < 26; ++i)
        t->children[i] = NULL;
    t->data = NULL;
    return t;
}
The insertion can also be realized in a recursive way. Denote the key to be inserted as K = k1k2...kn, where ki is the i-th character; K' is the rest of the characters except k1; v' is the satellite data to be inserted. The trie is in the form T = (v, C), where v is the satellite data and C = {(c1, T1), (c2, T2), ..., (cm, Tm)} is the map of children, mapping from character ci to sub-tree Ti. If T is empty, then C is also empty.

insert(T, K, v') = (v', C)                  : K = ∅
                   (v, ins(C, k1, K', v'))  : otherwise   (5.8)
If the key is empty, the previous value v is overwritten with v'. Otherwise, we need to check the children and perform recursive insertion. This is realized in the function ins(C, k1, K', v'). It examines the key-subtree pairs in C one by one. Let C' be the rest of the pairs except for the first one. This function can be defined as below.

ins(C, k1, K', v') = {(k1, insert(∅, K', v'))}           : C = ∅
                     {(k1, insert(T1, K', v'))} ∪ C'     : k1 = c1
                     {(c1, T1)} ∪ ins(C', k1, K', v')    : otherwise   (5.9)
If C is empty, we create a pair mapping from character k1 to a new empty tree, and recursively insert the rest of the characters. Otherwise, the algorithm locates the child which is mapped from k1 for further insertion.

The following Haskell example program implements the insertion algorithm.

insert t [] x = Trie (Just x) (children t)
insert t (k:ks) x = Trie (value t) (ins (children t) k ks x) where
    ins [] k ks x = [(k, (insert empty ks x))]
    ins (p:ps) k ks x = if fst p == k
                        then (k, insert (snd p) ks x) : ps
                        else p : (ins ps k ks x)
5.4.3 Look up

To look up a key, we also extract the characters from the key one by one. For each character, we search among the children to see if there is a branch matching this character. If there is no such child, the look up process terminates immediately to indicate a not-found error. When we reach the last character of the key, the data stored in the current node is what we are looking for.

1: function Look-Up(T, key)
2: if T = NIL then
3: return not found
4: for each c in key do
5: if Children(T)[c] = NIL then
6: return not found
7: T ← Children(T)[c]
8: return Data(T)
The below ANSI C program implements the look up algorithm. It returns NULL to indicate a not-found error.

void* lookup(struct Trie* t, const char* key) {
    while (*key && t && t->children[*key - 'a'])
        t = t->children[*key++ - 'a'];
    return (*key || !t) ? NULL : t->data;
}
The look up algorithm can also be realized in a recursive manner. When looking up a key, we start from the first character. If it is bound to some child, we then recursively search the rest of the characters in that child. Denote the trie as T = (v, C), and the key being searched as K = k1k2...kn if it isn't empty. The first character in the key is k1, and the rest of the characters are denoted as K'.

lookup(T, K) = v               : K = ∅
               ∅               : find(C, k1) = ∅
               lookup(T', K')  : find(C, k1) = T'   (5.10)
Where the function find(C, k) examines the key-child pairs one by one to check if any child is bound to the character k. If the list of pairs C is empty, the result is empty to indicate the non-existence of such a child. Otherwise, let C = {(k1, T1), (k2, T2), ..., (km, Tm)}, where the first sub-tree T1 is bound to k1, and the rest of the pairs are represented as C'. The below equation defines the find function.

find(C, k) = ∅            : C = ∅
             T1           : k1 = k
             find(C', k)  : otherwise   (5.11)
The following Haskell example program implements the trie looking up algorithm. It uses the lookup function for association lists provided in the standard library[5].

find t [] = value t
find t (k:ks) = case lookup k (children t) of
                  Nothing -> Nothing
                  Just t' -> find t' ks
Exercise 5.1

• Develop an imperative trie by using a collection data structure to manage the multiple sub-trees in an alphabetic trie.
5.5 Alphabetic Patricia

Similar to the integer trie, the alphabetic trie is not memory efficient. We can use the same method to compress the alphabetic trie to a Patricia.

5.5.1 Definition

An alphabetic Patricia tree is a special prefix tree in which each node contains multiple branches. All children of a node share the longest common prefix string. As a result, there is no node with only one child, because that would conflict with the longest common prefix property.

If we turn the trie shown in figure 5.9 into a Patricia by compressing all nodes which have only one child, we get the Patricia prefix tree shown in figure 5.10.
Figure 5.10: A Patricia prefix tree, with keys: 'a', 'an', 'another', 'bool', 'boy' and 'zoo'.
We can modify the definition of the alphabetic trie a bit to adapt it to Patricia. The Patricia is either empty, or a node in the form T = (v, C), where v is the optional satellite data and C = {(s1, T1), (s2, T2), ..., (sn, Tn)} is a list of pairs. Each pair contains a string si, which is bound to a sub-tree Ti.

The following Haskell example code defines Patricia accordingly.

type Key = String

data Patricia a = Patricia { value :: Maybe a
                           , children :: [(Key, Patricia a)]}

empty = Patricia Nothing []
The below Python code reuses the definition for trie to define Patricia.
class Patricia:
    def __init__(self, value = None):
        self.value = value
        self.children = {}
5.5.2 Insertion 
When inserting a key s, if the Patricia is empty, we create a leaf node as shown
in figure 5.11 (a). Otherwise, we need to check the children. If there is some sub-tree
Ti bound to the string si, and there exists a common prefix between si and
s, we need to branch out a new leaf Tj. The method is to create a new internal
branch node, bind it with the common prefix, and then set Ti as one child of this
branch, and Tj as the other child. Ti and Tj share the common prefix. This is
shown in figure 5.11 (b). However, there are two special cases: s may be the
prefix of si as shown in figure 5.11 (c), and si may be the prefix of s as in
figure 5.11 (d).

Figure 5.11: Patricia insertion. (a) Insert key `boy' into the empty Patricia; the
result is a leaf. (b) Insert key `bool'; a new branch with common prefix `bo' is
created. (c) Insert key `an' with value y into a node with prefix `another'. (d)
Insert `another' into the node with prefix `an'; we recursively insert key `other'
to the child.
The insertion algorithm can be described as below. 
1: function Insert(T, k, v)
2:   if T = NIL then
3:     T ← Empty-Node
4:   p ← T
5:   loop
6:     match ← FALSE
7:     for each (si, Ti) ∈ Children(p) do
8:       if k = si then
9:         Value(p) ← v
10:        return T
11:      c ← LCP(k, si)
12:      k1 ← k - c
13:      k2 ← si - c
14:      if c ≠ NIL then
15:        match ← TRUE
16:        if k2 = NIL then    ▷ si is prefix of k
17:          p ← Ti
18:          k ← k1
19:          break
20:        else    ▷ Branch out a new leaf
21:          Children(p) ← Children(p) ∪ {(c, Branch(k1, v, k2, Ti))}
22:          Delete(Children(p), (si, Ti))
23:          return T
24:    if ¬match then    ▷ Add a new leaf
25:      Children(p) ← Children(p) ∪ {(k, Create-Leaf(v))}
26:      return T
27:  return T
In the above algorithm, the LCP function finds the longest common prefix of
two given strings; for example, strings `bool' and `boy' have the longest common
prefix `bo'. The subtraction symbol '-' for strings gives the different part of two
strings; for example, `bool' - `bo' = `ol'. The Branch function creates a branch
node and updates keys accordingly.

The longest common prefix can be extracted by checking the characters in
the two strings one by one, until two characters don't match.
1: function LCP(A, B)
2:   i ← 1
3:   while i ≤ |A| ∧ i ≤ |B| ∧ A[i] = B[i] do
4:     i ← i + 1
5:   return A[1...i-1]
There are two cases when branching out a new leaf. Branch(s1, T1, s2, T2)
takes two different keys and two trees. If s1 is empty, we are dealing with the case
such as inserting key `an' into a child bound to string `another'. We set T2 as the
child of T1. Otherwise, we create a new branch node and set T1 and T2 as the
two children.
1: function Branch(s1, T1, s2, T2)
2:   if s1 = φ then
3:     Children(T1) ← Children(T1) ∪ {(s2, T2)}
4:     return T1
5:   T ← Empty-Node
6:   Children(T) ← {(s1, T1), (s2, T2)}
7:   return T
The following example Python program implements the Patricia insertion 
algorithm. 
def insert(t, key, value = None):
    if t is None:
        t = Patricia()
    node = t
    while True:
        match = False
        for k, tr in node.children.items():
            if key == k: # just overwrite
                node.value = value
                return t
            (prefix, k1, k2) = lcp(key, k)
            if prefix != "":
                match = True
                if k2 == "":
                    # example: insert "another" into "an", go on traversing
                    node = tr
                    key = k1
                    break
                else: # branch out a new leaf
                    node.children[prefix] = branch(k1, Patricia(value), k2, tr)
                    del node.children[k]
                    return t
        if not match: # add a new leaf
            node.children[key] = Patricia(value)
            return t
    return t
Where the functions to find the longest common prefix, and to branch out, are
implemented as below.
# returns (p, s1', s2'), where p is the lcp, s1' = s1 - p, s2' = s2 - p
def lcp(s1, s2):
    j = 0
    while j < len(s1) and j < len(s2) and s1[j] == s2[j]:
        j += 1
    return (s1[0:j], s1[j:], s2[j:])

def branch(key1, tree1, key2, tree2):
    if key1 == "":
        # example: insert "an" into "another"
        tree1.children[key2] = tree2
        return tree1
    t = Patricia()
    t.children[key1] = tree1
    t.children[key2] = tree2
    return t
The insertion can also be realized recursively. Starting from the root, the program
checks all the children to find if there is a node that matches the key. Matching
means they have a common prefix. For duplicated keys, the program overwrites
the previous value. There are also alternative solutions to handle duplicated
keys, such as using a linked-list etc. If there is no child matching the key, the
program creates a new leaf, and adds it to the children.
For Patricia T = (v, C), function insert(T, k, v') inserts key k and value v'
to the tree.

insert(T, k, v') = (v, ins(C, k, v'))        (5.12)
This function calls another internal function ins(C, k, v'). If the children C
is empty, a new leaf is created; otherwise the children are examined one by
one. Denote C = {(k1, T1), (k2, T2), ..., (kn, Tn)}; C' holds all the prefix-sub-tree
pairs except for the first one.
ins(C, k, v') =
    {(k, (v', φ))}               : C = φ
    {(k, (v', C_T1))} ∪ C'       : k1 = k
    {branch(k, v', k1, T1)} ∪ C' : match(k1, k)
    {(k1, T1)} ∪ ins(C', k, v')  : otherwise
(5.13)
The first clause deals with the edge case of empty children: a leaf node
containing v', bound to k, will be returned as the only child. The second
clause overwrites the previous value with v' if there is some child bound to the
same key. Note that C_T1 means the children of the sub-tree T1. The third clause
branches out a new leaf if the first child matches the key k. The last clause goes
on checking the rest children.
We define two keys A and B as matching if they have a non-empty common
prefix.

match(A, B) = A ≠ φ ∧ B ≠ φ ∧ a1 = b1        (5.14)

Where a1 and b1 are the first characters in A and B if they are not empty.
Function branch(k1, v, k2, T2) takes two keys, a value and a tree. Extract the
longest common prefix k = lcp(k1, k2), and denote the different parts as
k1' = k1 - k and k2' = k2 - k. The algorithm firstly handles the edge cases that
either k1 is the prefix of k2 or k2 is the prefix of k1. For the former case, it
creates a new node containing v, binds this node to k, and sets (k2', T2) as the
only child; for the latter case, it recursively inserts k1' and v to T2. Otherwise,
the algorithm creates a branch node, binds it to the longest common prefix k,
and sets two children for it: one child is (k2', T2), the other is a leaf node
containing v, bound to k1'.
branch(k1, v, k2, T2) =
    (k, (v, {(k2', T2)}))                : k = k1
    (k, insert(T2, k1', v))              : k = k2
    (k, (φ, {(k1', (v, φ)), (k2', T2)})) : otherwise
(5.15)
And function lcp(A, B) keeps taking the same characters from A and B one by
one. Denote a1 and b1 as the first characters in A and B if they are not empty;
A' and B' are the rest parts except for the first characters.
lcp(A, B) =
    φ                  : A = φ ∨ B = φ ∨ a1 ≠ b1
    {a1} ∪ lcp(A', B') : a1 = b1
(5.16)
The following Haskell example program implements the Patricia insertion 
algorithm. 
insert t k x = Patricia (value t) (ins (children t) k x)
  where
    ins [] k x = [(k, Patricia (Just x) [])]
    ins (p:ps) k x
        | fst p == k = (k, Patricia (Just x) (children (snd p))) : ps -- overwrite
        | match (fst p) k = (branch k x (fst p) (snd p)) : ps
        | otherwise = p : (ins ps k x)

leaf x = Patricia (Just x) []  -- a leaf node holding value x

match x y = x /= [] && y /= [] && head x == head y

branch k1 x k2 t2
    | k1 == k = (k, Patricia (Just x) [(k2', t2)])  -- ex: insert "an" into "another"
    | k2 == k = (k, insert t2 k1' x)                -- ex: insert "another" into "an"
    | otherwise = (k, Patricia Nothing [(k1', leaf x), (k2', t2)])
  where
    k = lcp k1 k2
    k1' = drop (length k) k1
    k2' = drop (length k) k2

lcp [] _ = []
lcp _ [] = []
lcp (x:xs) (y:ys) = if x == y then x : lcp xs ys else []
5.5.3 Look up 
When looking up a key, we can't examine the characters one by one as in the trie.
Starting from the root, we need to search among the children to see if any one is
bound to a prefix of the key. If there is such a child, we update the key by removing
the prefix part, and recursively look up the updated key in this child. If there
aren't any children bound to any prefix of the key, the looking up fails.
1: function Look-Up(T, k)
2:   if T = NIL then
3:     return not found
4:   repeat
5:     match ← FALSE
6:     for ∀(ki, Ti) ∈ Children(T) do
7:       if k = ki then
8:         return Data(Ti)
9:       if ki is prefix of k then
10:        match ← TRUE
11:        k ← k - ki
12:        T ← Ti
13:        break
14:  until ¬match
15:  return not found
The below Python example program implements the looking up algorithm. It
reuses the lcp(s1, s2) function defined previously to test if a string is the
prefix of the other.
def lookup(t, key):
    if t is None:
        return None
    while True:
        match = False
        for k, tr in t.children.items():
            if k == key:
                return tr.value
            (prefix, k1, k2) = lcp(key, k)
            if prefix != "" and k2 == "":
                match = True
                key = k1
                t = tr
                break
        if not match:
            return None
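As a quick usage sketch (assuming the insert, lcp and branch functions defined
above; the stored values here are arbitrary), building the tree of figure 5.10 and
querying it might look like this:

# Build the Patricia of figure 5.10 and look up some keys.
t = None
for w in ["a", "an", "another", "bool", "boy", "zoo"]:
    t = insert(t, w, w.upper())   # store an arbitrary value per key
print(lookup(t, "another"))       # prints 'ANOTHER'
print(lookup(t, "bo"))            # prints None: 'bo' is only an internal prefix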
This algorithm can also be realized recursively. For Patricia in form T =
(v, C), it calls another function to find among the children C.

lookup(T, k) = find(C, k)        (5.17)
If C is empty, the looking up fails; otherwise, for C = {(k1, T1), (k2, T2), ..., (kn, Tn)},
we firstly examine k against k1; if they don't match, we recursively check the rest
pairs, denoted as C'.
find(C, k) =
    φ                  : C = φ
    v_T1               : k = k1
    lookup(T1, k - k1) : k1 ⊏ k
    find(C', k)        : otherwise
(5.18)
Where A ⊏ B means string A is a prefix of B. find mutually calls lookup if
some child is bound to a string which is a prefix of the key.

The below Haskell example program implements the looking up algorithm.
import qualified Data.List

find t k = find' (children t) k
  where
    find' [] _ = Nothing
    find' (p:ps) k
        | fst p == k = value (snd p)
        | fst p `Data.List.isPrefixOf` k = find (snd p) (diff (fst p) k)
        | otherwise = find' ps k
    diff k1 k2 = drop (length (lcp k1 k2)) k2
5.6 Trie and Patricia applications

Trie and Patricia can be used to solve some interesting problems. Integer
based prefix trees are used in compiler implementation. Some daily used software
applications have many interesting features which can be realized with trie or
Patricia. In this section, some applications are given as examples, including
e-dictionary, word auto-completion, and the T9 input method. The commercial
implementations typically do not adopt trie or Patricia directly. The solutions we
demonstrate here are for illustration purpose only.
5.6.1 E-dictionary and word auto-completion

Figure 5.12 shows a screen shot of an English-Chinese E-dictionary. In order to
provide a good user experience, the dictionary searches its word library, and lists
all candidate words and phrases similar to what the user has entered.

Figure 5.12: E-dictionary. All candidates starting with what the user has input
are listed.
An E-dictionary typically contains hundreds of thousands of words. It's very
expensive to perform a whole word search. Commercial software adopts complex
approaches, including caching, indexing etc. to speed up this process.
Similar to the e-dictionary, figure 5.13 shows a popular Internet search engine.
When the user inputs something, it provides a candidate list, with all items starting
with what the user has entered. And these candidates are shown in the order
of popularity: the more people search for a word, the higher its position in the list.
In both cases, the software provides a kind of word auto-completion mechanism.
In some modern IDEs, the editor can even help users to auto-complete
program code.
Let's see how to implement the e-dictionary with trie or Patricia. To
simplify the problem, we assume the dictionary only supports English - English
information.
Figure 5.13: A search engine. All candidates starting with what the user has
input are listed.
A dictionary stores key-value pairs, where the keys are English words or phrases,
and the values are the meanings described in English sentences.

We can store all the words and their meanings in a trie, but it isn't space
effective, especially when there is a huge amount of items. We'll use Patricia to
realize the e-dictionary.

When the user wants to look up the word 'a', the dictionary does not only return
the meaning of 'a', but also provides a list of candidate words, which all start
with 'a', including 'abandon', 'about', 'accent', 'adam', ... Of course all these
words are stored in the Patricia.
If there are too many candidates, one solution is to display only the top 10
words, and the user can browse for more.
The following algorithm reuses the looking up defined for Patricia. When it
finds a node bound to a string which is the prefix of what we are looking for, it
expands all its children until getting n candidates.
1: function Look-Up(T, k, n)
2:   if T = NIL then
3:     return φ
4:   prefix ← NIL
5:   repeat
6:     match ← FALSE
7:     for ∀(ki, Ti) ∈ Children(T) do
8:       if k is prefix of ki then
9:         return Expand(Ti, prefix + ki, n)
10:      if ki is prefix of k then
11:        match ← TRUE
12:        k ← k - ki
13:        T ← Ti
14:        prefix ← prefix + ki
15:        break
16:  until ¬match
17:  return φ
Where function Expand(T, prefix, n) picks n sub-trees, which share the
same prefix, in T. It is realized as a BFS (Breadth-First Search) traverse. The
chapter about searching explains BFS in detail.
1: function Expand(T, prefix, n)
2:   R ← φ
3:   Q ← {(prefix, T)}
4:   while |R| < n ∧ |Q| > 0 do
5:     (k, T) ← Pop(Q)
6:     if Data(T) ≠ NIL then
7:       R ← R ∪ {(k, Data(T))}
8:     for ∀(ki, Ti) ∈ Children(T) do
9:       Push(Q, (k + ki, Ti))
The following example Python program implements the e-dictionary application.
When testing if a string is the prefix of another one, it uses the find function
provided in the standard string library.
import string

def patricia_lookup(t, key, n):
    if t is None:
        return None
    prefix = ""
    while True:
        match = False
        for k, tr in t.children.items():
            if string.find(k, key) == 0: # key is prefix of k
                return expand(prefix + k, tr, n)
            if string.find(key, k) == 0: # k is prefix of key
                match = True
                key = key[len(k):]
                t = tr
                prefix += k
                break
        if not match:
            return None
def expand(prefix, t, n):
    res = []
    q = [(prefix, t)]
    while len(res) < n and len(q) > 0:
        (s, p) = q.pop(0)
        if p.value is not None:
            res.append((s, p.value))
        for k, tr in p.children.items():
            q.append((s + k, tr))
    return res
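As a usage sketch (assuming the Patricia insert from section 5.5.2, and with
hypothetical meaning strings as values), the e-dictionary can be exercised like
this:

# Populate a small word library, then fetch at most 3 candidates for 'a'.
t = None
for w in ["a", "an", "another", "abandon", "about"]:
    t = insert(t, w, "meaning of " + w)
print(patricia_lookup(t, "a", 3))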
This algorithm can also be implemented recursively: if the string we are looking
for is empty, we expand all children until getting n candidates; otherwise
we recursively examine the children to find one which has a prefix equal to this
string.
In programming environments supporting lazy evaluation, an intuitive solution
is to expand all candidates, and take the first n on demand. Denote the
Patricia prefix tree in form T = (v, C); the below function enumerates all items
starting with key k.
findAll(T, k) =
    enum(C)            : k = φ, v = φ
    {(φ, v)} ∪ enum(C) : k = φ, v ≠ φ
    find(C, k)         : k ≠ φ
(5.19)
The first two clauses deal with the edge cases when the key is empty: all the
children are enumerated, except for those with empty values. The last clause
finds the child matching k.
For non-empty children, C = {(k1, T1), (k2, T2), ..., (km, Tm)}, denote the
rest pairs except for the first one as C'. The enumeration algorithm can be
defined as below.
enum(C) =
    φ                                        : C = φ
    mapAppend(k1, findAll(T1, φ)) ∪ enum(C') : otherwise
(5.20)
Where mapAppend(k, L) = {(k + ki, vi) | (ki, vi) ∈ L}. It concatenates the
prefix k in front of every key-value pair in the list L.
Function find(C, k) is defined as the following. For empty children, the
result is empty as well; otherwise, it examines the first child T1, which is bound
to the string k1. If k1 is equal to k, it calls mapAppend to add the prefix to the keys of
all the children under T1; if k1 is a prefix of k, the algorithm recursively finds all
children starting with k - k1; on the other hand, if k is a prefix of k1, all children
under T1 are valid results. Otherwise, the algorithm bypasses the first child and
goes on finding in the rest children.
find(C, k) =
    φ                                  : C = φ
    mapAppend(k, findAll(T1, φ))       : k1 = k
    mapAppend(k1, findAll(T1, k - k1)) : k1 ⊏ k
    findAll(T1, φ)                     : k ⊏ k1
    find(C', k)                        : otherwise
(5.21)
The below example Haskell program implements the e-dictionary application
according to the above equations.
findAll :: Patricia a -> Key -> [(Key, a)]
findAll t [] =
    case value t of
      Nothing -> enum $ children t
      Just x  -> ("", x) : (enum $ children t)
    where
      enum [] = []
      enum (p:ps) = (mapAppend (fst p) (findAll (snd p) [])) ++ (enum ps)
findAll t k = find' (children t) k
  where
    find' [] _ = []
    find' (p:ps) k
        | fst p == k = mapAppend k (findAll (snd p) [])
        | fst p `Data.List.isPrefixOf` k
            = mapAppend (fst p) (findAll (snd p) (k `diff` (fst p)))
        | k `Data.List.isPrefixOf` (fst p) = findAll (snd p) []
        | otherwise = find' ps k
    diff x y = drop (length y) x

mapAppend s lst = map (\p -> (s ++ (fst p), snd p)) lst
In the lazy evaluation environment, the top n candidates can be gotten like
take(n, findAll(T, k)). Appendix A has the detailed definition of the take function.
5.6.2 T9 input method 
Most mobile phones around the year 2000 were equipped with a keypad. Users had
a quite different experience from a PC when editing a short message or email,
because the mobile-phone keypad, the so called ITU-T keypad, has much
fewer keys than a PC. Figure 5.14 shows one example.

Figure 5.14: An ITU-T keypad for a mobile phone.
There are typically two methods to input an English word or phrase with an ITU-T
keypad. For instance, if the user wants to enter the word `home', he can press the
keys in the below sequence:

- Press key '4' twice to enter the letter 'h';
- Press key '6' three times to enter the letter 'o';
- Press key '6' twice to enter the letter 'm';
- Press key '3' twice to enter the letter 'e'.
Another much quicker way is to just press the following keys:

- Press keys '4', '6', '6', '3'; the word `home' appears on top of the candidate list;
- Press key '*' to change to another candidate word, so the word `good' appears;
- Press key '*' again to change to another candidate word; the next word `gone'
appears;
- ...
Comparing these two methods, we can see the second one is much easier for
the end user. The only overhead is to store a dictionary of candidate words.
Method 2 is called the `T9' input method, or predictive input method [6], [7].
The abbreviation 'T9' stands for 'textonym': it starts with 'T', with 9 characters.
T9 input can also be realized with trie or Patricia.
In order to provide candidate words to the user, a dictionary must be prepared in
advance. Trie or Patricia can be used to store the dictionary. Commercial T9
implementations typically use complex indexing dictionaries in both the file system
and the cache. The realization shown here is for illustration purpose only.
Firstly, we need to define the T9 mapping, which maps from a digit to the
candidate characters.

M_T9 = {2 → "abc", 3 → "def", 4 → "ghi",
        5 → "jkl", 6 → "mno", 7 → "pqrs",
        8 → "tuv", 9 → "wxyz"}        (5.22)
With this mapping, M_T9[i] returns the corresponding characters for digit i.

Suppose the user inputs digits D = d1 d2 ... dn. If D isn't empty, denote the rest
digits except for d1 as D'. The below pseudo code shows how to realize T9 with trie.
1: function Look-Up-T9(T, D)
2:   Q ← {(φ, D, T)}
3:   R ← φ
4:   while Q ≠ φ do
5:     (prefix, D, T) ← Pop(Q)
6:     for each c in M_T9[d1] do
7:       if c ∈ Children(T) then
8:         if D' = φ then
9:           R ← R ∪ {prefix + c}
10:        else
11:          Push(Q, (prefix + c, D', Children(T)[c]))
12:  return R
Where prefix + c means appending character c to the end of the string prefix.
Again, this algorithm performs a BFS search with a queue Q. The queue is
initialized with a tuple (prefix, D, T), containing an empty prefix, the digit sequence
to be searched, and the trie. It keeps picking tuples from the queue as far
as it isn't empty. Then it gets the candidate characters from the first digit to be
processed via the T9 map. For each character c, if there is a sub-tree bound to
it, we create a new tuple, update the prefix by appending c, use the rest of the
digits to update D, and use that sub-tree. This new tuple is pushed back to the
queue for further searching. If all the digits are processed, it means a candidate
word is found. We put this word to the result list R.
The following example program in Python implements this T9 search with 
trie. 
T9MAP = {'2':"abc", '3':"def", '4':"ghi", '5':"jkl",
         '6':"mno", '7':"pqrs", '8':"tuv", '9':"wxyz"}
def trie_lookup_t9(t, key):
    if t is None or key == "":
        return None
    q = [("", key, t)]
    res = []
    while len(q) > 0:
        (prefix, k, t) = q.pop(0)
        i = k[0]
        if not i in T9MAP:
            return None # invalid input
        for c in T9MAP[i]:
            if c in t.children:
                if k[1:] == "":
                    res.append((prefix + c, t.children[c].value))
                else:
                    q.append((prefix + c, k[1:], t.children[c]))
    return res
Because trie is not space efficient, we can modify the above algorithm with a
Patricia solution. As far as the queue isn't empty, the algorithm pops the tuple.
This time, we examine all the prefix-sub-tree pairs. For every pair (ki, Ti), we
convert the alphabetic prefix ki back to a digit sequence D' by looking up the T9
map. If D' exactly matches the digits of what the user input, we find a candidate
word; otherwise if the digit sequence is a prefix of what the user inputs, the program
creates a new tuple, updates the prefix, the digits to be processed, and the
sub-tree, then puts the tuple back to the queue for further search.
1: function Look-Up-T9(T, D)
2:   Q ← {(φ, D, T)}
3:   R ← φ
4:   while Q ≠ φ do
5:     (prefix, D, T) ← Pop(Q)
6:     for each (ki, Ti) ∈ Children(T) do
7:       D' ← Convert-T9(ki)
8:       if D' ⊏ D then    ▷ D' is prefix of D
9:         if D' = D then
10:          R ← R ∪ {prefix + ki}
11:        else
12:          Push(Q, (prefix + ki, D - D', Ti))
13:  return R
Function Convert-T9(K) converts each character in K back to a digit.

1: function Convert-T9(K)
2:   D ← φ
3:   for each c ∈ K do
4:     for each (d → S) ∈ M_T9 do
5:       if c ∈ S then
6:         D ← D ∪ {d}
7:         break
8:   return D
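A direct Python rendering of Convert-T9 (named toT9 here, since the Patricia
program below refers to it by that name; a sketch assuming the T9MAP table
defined in the trie example above) could be:

# Convert a word back to its digit sequence, e.g. toT9("home") == "4663".
def toT9(word):
    return "".join(d for c in word for d, s in T9MAP.items() if c in s)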
The following example Python program implements the T9 input method 
with Patricia.
def patricia_lookup_t9(t, key):
    if t is None or key == "":
        return None
    q = [("", key, t)]
    res = []
    while len(q) > 0:
        (prefix, key, t) = q.pop(0)
        for k, tr in t.children.items():
            digits = toT9(k)
            if string.find(key, digits) == 0: # digits is prefix of key
                if key == digits:
                    res.append((prefix + k, tr.value))
                else:
                    q.append((prefix + k, key[len(k):], tr))
    return res
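As a usage sketch (assuming the Patricia insert from section 5.5.2), searching
the digit sequence '4663' over a small dictionary might look like this:

# All four words map to the key sequence "4663".
t = None
for w in ["good", "gone", "home", "hood"]:
    t = insert(t, w, w)
print(patricia_lookup_t9(t, "4663"))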
The T9 input method can also be realized recursively. Let's first define the trie
solution. The algorithm takes two arguments: a trie storing all the candidate
words, and a sequence of digits that is input by the user. If the sequence
is empty, the result is empty as well; otherwise, it looks up C to find those
children which are bound to the first digit d1 according to the T9 map.
findT9(T, D) =
    {φ}                         : D = φ
    fold(f, φ, lookupT9(d1, C)) : otherwise
(5.23)
Where folding is defined in Appendix A. Function f takes two arguments:
an intermediate list of candidates which is initialized empty, and a pair (c, T'),
where c is a candidate character, to which sub-tree T' is bound. It appends
character c to all the candidate words, and concatenates this to the result list.
f(L, (c, T')) = mapAppend(c, findT9(T', D')) ∪ L        (5.24)
Note this mapAppend function is a bit different from the previous one defined
in the e-dictionary application. The first argument is a character, not a string.
Function lookupT9(d, C) checks all the possible characters mapped to digit
d. If a character is bound to some child in C, it is recorded as one candidate.

lookupT9(d, C) = fold(g, φ, M_T9[d])        (5.25)
Where

g(L, k) =
    L             : find(C, k) = φ
    {(k, T')} ∪ L : find(C, k) = T'
(5.26)
The below Haskell example program implements the T9 look up algorithm with
trie.
mapT9 = [('2', "abc"), ('3', "def"), ('4', "ghi"), ('5', "jkl"),
         ('6', "mno"), ('7', "pqrs"), ('8', "tuv"), ('9', "wxyz")]

findT9 t [] = [("", value t)]
findT9 t (k:ks) = foldl f [] (lookupT9 k (children t))
  where
    f lst (c, tr) = (mapAppend' c (findT9 tr ks)) ++ lst

lookupT9 c children = case lookup c mapT9 of
    Nothing -> []
    Just s  -> foldl f [] s
  where
    f lst x = case lookup x children of
        Nothing -> lst
        Just t  -> (x, t) : lst

mapAppend' x lst = map (\p -> (x : (fst p), snd p)) lst
There are a few modifications when changing the realization from trie to Patricia.
Firstly, the sub-tree is bound to a prefix string, not a single character.
findT9(T, D) =
    {φ}                             : D = φ
    fold(f, φ, findPrefixT9(D, C))  : otherwise
(5.27)
The list for folding is given by calling function findPrefixT9(D, C). And
f is also modified to reflect this change: it appends the candidate prefix D' in
front of every result output by the recursive search, and then accumulates the
words.
f(L, (D', T')) = mapAppend(D', findT9(T', D - D')) ∪ L        (5.28)
Function findPrefixT9(D, C) examines all the children. For every pair
(ki, Ti), if converting ki back to digits yields a prefix of D, then this pair is
selected as a candidate.
findPrefixT9(D, C) = {(ki, Ti) | (ki, Ti) ∈ C, convertT9(ki) ⊏ D}        (5.29)
Function convertT9(K) converts every alphabetic character in K back to digits
according to the T9 map.

convertT9(K) = {d | ∀c ∈ K, ∃(d → S) ∈ M_T9 ⇒ c ∈ S}        (5.30)
The following example Haskell program implements the T9 input algorithm 
with Patricia. 
findT9 t [] = [("", value t)]
findT9 t k = foldl f [] (findPrefixT9 k (children t))
  where
    f lst (s, tr) = (mapAppend s (findT9 tr (k `diff` s))) ++ lst
    diff x y = drop (length y) x

findPrefixT9 s lst = filter f lst
  where f (k, _) = (toT9 k) `Data.List.isPrefixOf` s

toT9 = map (\c -> head $ [d | (d, s) <- mapT9, c `elem` s])
Exercise 5.2

- For the T9 input, compare the results of the algorithms realized with trie
and Patricia; the sequences are different. Why does this happen? How can we
modify the algorithm so that they output the candidates in the same
sequence?
5.7 Summary

In this chapter, we start from the integer based trie and Patricia. The map
data structure based on integer Patricia plays an important role in compiler
implementation. Alphabetic trie and Patricia are natural extensions, and they can
be used to manipulate text information. As examples, a predictive e-dictionary
and the T9 input method are realized with trie or Patricia. Although these examples
are different from the real implementations in commercial software, they show
simple approaches to solve some problems. Another important data structure,
the suffix tree, has a close relationship with them. The suffix tree is introduced in
the next chapter.
Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford
Stein. Introduction to Algorithms, Second Edition. Problem 12-1.
ISBN: 0262032937. The MIT Press. 2001

[2] Chris Okasaki and Andrew Gill. Fast Mergeable Integer Maps. Workshop
on ML, September 1998, pages 77-86. http://www.cse.ogi.edu/~andy/pub/finite.htm

[3] D.R. Morrison. PATRICIA - Practical Algorithm To Retrieve Information
Coded In Alphanumeric. Journal of the ACM, 15(4), October 1968, pages
514-534.

[4] Suffix Tree, Wikipedia. http://en.wikipedia.org/wiki/Suffix_tree

[5] Trie, Wikipedia. http://en.wikipedia.org/wiki/Trie

[6] T9 (predictive text), Wikipedia. http://en.wikipedia.org/wiki/T9_(predictive_text)

[7] Predictive text, Wikipedia. http://en.wikipedia.org/wiki/Predictive_text
Chapter 6

Suffix Tree

6.1 Introduction

The suffix tree is an important data structure. It can be used to realize many
important string operations particularly fast[3]. It is also widely used in the
bio-information area, such as DNA pattern matching[4]. Weiner introduced the
suffix tree in 1973[2]. The latest on-line construction algorithm was found in 1995[1].

The suffix tree for a string S is a special Patricia. Each edge is labeled with
some sub-string of S. Each suffix of S corresponds to exactly one path from the
root to a leaf. Figure 6.1 shows the suffix tree for the English word `banana'.
Figure 6.1: The suffix tree for `banana'
All suffixes, 'banana', 'anana', 'nana', 'ana', 'na', 'a', and the empty string φ,
can be found in the above tree. Among them the first 3 suffixes are explicitly
shown; the others are implicitly represented. The reason why 'ana', 'na', 'a', and
φ are not shown is that they are prefixes of the others. In order to show all
suffixes explicitly, we can append a special pad terminal symbol, which doesn't
occur in other places in the string. Such a terminator is typically denoted as '$'.
By this means, there is no suffix being the prefix of the others.

Although the suffix tree for 'banana' is simple, the suffix tree for 'bananas',
as shown in figure 6.2, is quite different.
To create the suffix tree for a given string, we can utilize the insertion
algorithm explained in the previous chapter for Patricia.

1: function Suffix-Tree(S)
2:   T ← NIL
3:   for i ← 1 to |S| do
4:     T ← Patricia-Insert(T, Right(S, i))
5:   return T
Figure 6.2: The suffix tree for `bananas'
For a non-empty string S = s1 s2 ... si ... sn of length n = |S|, function Right(S, i)
= si si+1 ... sn. It extracts the sub-string of S from the i-th character to the last
one. This straightforward algorithm can also be defined as below.

suffixT(S) = fold(insertPatricia, φ, suffixes(S))        (6.1)
Where function suffixes(S) gives all the suffixes for string S. If the string
is empty, the result is one empty string; otherwise, S itself is one suffix, and the
others can be given by recursively calling suffixes(S'), where S' is given by
dropping the first character from S.
suffixes(S) =
    {φ}                : S = φ
    {S} ∪ suffixes(S') : otherwise
(6.2)
This solution constructs the suffix tree in O(n^2) time, for a string of length n.
It totally inserts n suffixes to the tree, and each insertion takes linear time
proportional to the length of the suffix. The efficiency isn't good enough.
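A minimal Python sketch of this quadratic construction, reusing the Patricia
insert(t, key, value) defined in the previous chapter (storing the suffix start
index as the value is an arbitrary choice for illustration), could be:

# O(n^2) suffix tree: insert every suffix of s into a Patricia.
def naive_suffix_tree(s):
    t = None
    for i in range(len(s)):
        t = insert(t, s[i:], i)   # value = start index of the suffix
    return t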
In this chapter, we firstly explain a fast on-line suffix trie construction solution
by using the suffix link concept. Because trie isn't space efficient, we next
introduce a linear time on-line suffix tree construction algorithm found by Ukkonen,
and show how to solve some interesting string manipulation problems with the
suffix tree.
6.2 Suffix trie

Just like the relationship between trie and Patricia, the suffix trie has a much simpler
structure than the suffix tree. Figure 6.3 shows the suffix trie for 'banana'.
Comparing with figure 6.1, we can find the difference between suffix tree and
suffix trie. Instead of representing a word, every edge in the suffix trie only represents
a character. Thus the suffix trie needs much more space. If we pack all nodes
which have only one child, the suffix trie is turned into a suffix tree.
We can reuse the trie definition for the suffix trie. Each node is bound to a
character, and contains multiple sub-trees as children. A child can be referred
from the bound character.
Figure 6.3: Suffix trie for 'banana'
6.2.1 Node transfer and suffix link
For string S of length n, define Si = s1 s2 ... si. It is the prefix containing the
first i characters.

In a suffix trie, each node represents a suffix string. For example in figure 6.4,
node X represents the suffix 'a'; by adding character 'c', node X transfers to Y,
which represents the suffix 'ac'. We say node X transfers to Y with the edge of
character 'c'[1].

Y ← Children(X)[c]
We also say that node X has a 'c'-child Y. The below Python expression reflects
this concept.

y = x.children[c]
If node A in a suffix trie represents suffix si si+1 ... sn, and node B represents
suffix si+1 si+2 ... sn, we say node B represents the suffix of node A. We can
create a link from A to B. This link is defined as the suffix link of node A[1].
Suffix links are drawn in dotted style. In figure 6.4, the suffix link of node A points
to node B, and the suffix link of node B points to node C.
The suffix link is valid for all nodes except the root. We can add a suffix link
field to the trie definition. The below Python example code shows this update.
class STrie:
    def __init__(self, suffix=None):
        self.children = {}
        self.suffix = suffix
Figure 6.4: Suffix trie for string "cacao". Node X ← "a", node Y ← "ac"; X
transfers to Y with character 'c'.
suffix string
s1 s2 s3 ... si
s2 s3 ... si
...
si-1 si
si
φ

Table 6.1: suffixes for Si
6.2.2 On-line construction

For string S, suppose we have constructed the suffix trie for the i-th prefix Si =
s1 s2 ... si. Denote it as SuffixTrie(Si). Let's consider how to obtain SuffixTrie(Si+1)
from SuffixTrie(Si).

If we list all suffixes corresponding to SuffixTrie(Si), from the longest (which
is Si) to the shortest (which is empty), we can get table 6.1. There are i + 1
suffixes in total.
One solution is to append the character si+1 to every suffix in this table,
then add another empty string. This idea can be realized by adding a new child
for every node in the trie, and binding all these new children with the edge of
character si+1.
However, some nodes in SuffixTrie(Si) may have the si+1-child already.
For example, in figure 6.5, nodes X and Y correspond to the suffixes 'cac' and
'ac' respectively. They don't have the 'a'-child. But node Z, which represents
the suffix 'c', has the 'a'-child already.
Algorithm 1 Update SuffixTrie(Si) to SuffixTrie(Si+1), initial version.
1: for ∀T ∈ SuffixTrie(Si) do
2:   Children(T)[si+1] ← Create-Empty-Node
Figure 6.5: Suffix trie of "cac" (a) and "caca" (b).
When appending si+1 to SuffixTrie(Si) (in this example si+1 is the character 'a'),
we need to create new nodes for X and Y, but we needn't do this for Z.

If we check the nodes one by one according to table 6.1, we can stop immediately
when we meet a node which has the si+1-child. This is because if node X in
SuffixTrie(Si) has the si+1-child, then according to the definition of the suffix link,
any suffix node X' of X in SuffixTrie(Si) must also have the si+1-child. In
other words, let c = si+1; if wc is a sub-string of Si, then every suffix of wc is
also a sub-string of Si [1]. The only exception is the root, which represents the
empty string φ.
According to this fact, we can refine algorithm 1 to the following.

Algorithm 2 Update SuffixTrie(Si) to SuffixTrie(Si+1), second version.
1: for each T ∈ SuffixTrie(Si) in descending order of suffix length do
2:   if Children(T)[si+1] = NIL then
3:     Children(T)[si+1] ← Create-Empty-Node
4:   else
5:     break
The next question is how to iterate all nodes in descending order of suffix
length. Define the top of a suffix trie as the deepest leaf node. This definition
ensures the top represents the longest suffix. Along the suffix link from the top
to the next node, the length of the suffix decreases by one. This fact tells us
that we can traverse the suffix trie from the top to the root by using the suffix
links, and the order of such traversing is exactly what we want. Finally, there
is a special suffix trie for the empty string, SuffixTrie(NIL); we define the top
to be equal to the root in this case.
function Insert(top, c)
    if top = NIL then    ▷ The trie is empty
        top ← Create-Empty-Node
    T ← top
    T' ← Create-Empty-Node    ▷ dummy init value
    while T ≠ NIL ∧ Children(T)[c] = NIL do
        Children(T)[c] ← Create-Empty-Node
        Suffix-Link(T') ← Children(T)[c]
        T' ← Children(T)[c]
        T ← Suffix-Link(T)
    if T ≠ NIL then
        Suffix-Link(T') ← Children(T)[c]
    return Children(top)[c]    ▷ returns the new top
Function Insert updates SuffixTrie(Si) to SuffixTrie(Si+1). It takes
two arguments: one is the top of SuffixTrie(Si), the other is the character si+1.
If the top is NIL, it means the tree is empty, so there is no root. The algorithm
creates a root node in this case. A sentinel empty node T' is created; it keeps
tracking the previously created new node. In the main loop, the algorithm checks
every node one by one along the suffix links. If the node hasn't the si+1-child, it
then creates a new node, and binds the edge to character si+1. The algorithm
repeatedly goes up along the suffix links until it either arrives at the root, or finds a
node which has the si+1-child already. After the loop, if the node isn't empty,
it means we stopped at a node which has the si+1-child. The last suffix link then
points to that child. Finally, the new top position is returned, so that we can
further insert other characters to the suffix trie.
For a given string S, the suffix trie can be built by repeatedly calling the Insert
function.

1: function Suffix-Trie(S)
2:   t ← NIL
3:   for i ← 1 to |S| do
4:     t ← Insert(t, si)
5:   return t
This algorithm returns the top of the suffix trie, but not the root. In order
to access the root, we can traverse along the suffix links.

1: function Root(T)
2:   while Suffix-Link(T) ≠ NIL do
3:     T ← Suffix-Link(T)
4:   return T
Figure 6.6 shows the steps when constructing the suffix trie for "cacao". Only the
last layer of suffix links is shown.
For the Insert algorithm, the computation time is proportional to the size of the
suffix trie. In the worst case, the suffix trie is built in O(n^2) time, where n = |S|.
One example is S = a^n b^n: there are n characters of 'a' followed by n characters of 'b'.
The following example Python program implements the suffix trie construction
algorithm.
def suffix_trie(str):
    t = None
    for c in str:
        t = insert(t, c)
    return root(t)

def insert(top, c):
    if top is None:
        top = STrie()
    node = top
    new_node = STrie() # dummy init value
    while (node is not None) and (c not in node.children):
        new_node.suffix = node.children[c] = STrie(node)
        new_node = node.children[c]
        node = node.suffix
    if node is not None:
        new_node.suffix = node.children[c]
    return top.children[c] # update top

def root(node):
    while node.suffix is not None:
        node = node.suffix
    return node
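A quick check of this program (assuming the STrie definition above) builds the
trie of figure 6.6 and follows one suffix path:

# Build the suffix trie for "cacao", then walk the suffix "cao" from the root.
r = suffix_trie("cacao")
node = r
for c in "cao":
    node = node.children[c]   # each step must exist, since "cao" is a suffix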
Figure 6.6: Construct the suffix trie for "cacao" in six steps: (a) empty, (b) "c",
(c) "ca", (d) "cac", (e) "caca", (f) "cacao". Only the last layer of suffix links is
shown in dotted arrows.
6.3 Suffix Tree

The suffix trie isn't space efficient, and the construction time is quadratic. If we
don't care about the speed, we can compress the suffix trie to a suffix tree[6]. Ukkonen
found a linear time on-line suffix tree construction algorithm in 1995.
6.3.1 On-line construction

Active point and end point

The suffix trie construction algorithm shows a very important fact about what
happens when SuffixTrie(Si) updates to SuffixTrie(Si+1). Let's review the
last two steps in figure 6.6.
There are two different updates:

1. All leaves are appended with a new node for si+1;
2. Some non-leaf nodes are branched out with a new node for si+1.
The first type of update is trivial, because for all new coming characters, we
need to do this work anyway. Ukkonen defines such a leaf as an 'open' node.

The second type of update is important. We need to figure out which internal
nodes need to branch out. We only focus on these nodes and apply the update.
Ukkonen defines the path along the suffix links from the top to the end as
the 'boundary path'. Denote the nodes in the boundary path as n1, n2, ..., nj, ..., nk.
These nodes start from the leaf node (the first one is the top position). Suppose
that after the j-th node, they are not leaves any longer; we need to repeatedly
branch out from this point till the k-th node.
Ukkonen defines the first non-leaf node nj as the 'active point' and the last
node nk as the 'end point'. The end point can be the root.
Reference pair 
Figure 6.7: Suffix tree of "bananas". X transfers to Y with the sub-string "na".
Figure 6.7 shows the suffix tree of the English word "bananas". Node X represents
the suffix "a". By adding the sub-string "na", node X transfers to node Y, which
represents the suffix "ana". In other words, we can represent Y with a pair of a node
and a sub-string, like (X, w), where w = "na". Ukkonen defines such a
pair as a reference pair. Not only explicit nodes, but also implicit positions
in the suffix tree can be represented with reference pairs. For example, (X, "n")
represents a position which is not an explicit node. By using reference pairs,
we can represent every position in a suffix tree.
In order to save space, for string S, all sub-strings can be represented as
a pair of indices (l, r), where l is the left index and r is the right index of the
characters of the sub-string. For instance, if S = "bananas", and the index
starts from 1, the sub-string "na" can be represented with the pair (3, 4). As a result,
there is only one copy of the complete string, and every position in the suffix tree
is defined as (node, (l, r)). This is the final form of the reference pair.
With reference pairs, node transfer for the suffix tree can be defined as the
following.

Children(X)[sl] ← ((l, r), Y) ⟺ Y ← (X, (l, r))

If character sl = c, we say that node X has a c-child, and this child is Y. Y can
be transferred from X with the sub-string (l, r). Each node can have at most one
c-child.
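As a minimal sketch (the Node class and the dictionary layout here are
assumptions for illustration, not the book's final implementation), reference-pair
children can be represented in Python like this:

# Each child entry maps the first character c = s[l] to ((l, r), child),
# so the edge label is the sub-string s[l..r] of the one shared string.
class Node:
    def __init__(self):
        self.children = {}    # c -> ((l, r), Node)
        self.suffix = None    # suffix link

s = "bananas"                   # the book's indices start from 1
x, y = Node(), Node()
x.children['n'] = ((3, 4), y)   # X transfers to Y with "na" = (3, 4)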
Canonical reference pair

It's obvious that one position in a suffix tree may have multiple reference
pairs. For example, node Y in figure 6.7 can be denoted either as (X, (3, 4)) or
(root, (2, 4)). If we define the empty string φ = (i, i - 1), Y can also be represented
as (Y, φ).

The canonical reference pair is the one which has the closest node to the position.
Specially, in case the position is an explicit node, the canonical reference
pair is (node, φ), so (Y, φ) is the canonical reference pair of node Y.
The below algorithm converts a reference pair (node, (l, r)) to the canonical
reference pair (node', (l', r)). Note that since r doesn't change, the algorithm
only returns (node', l') as the result.
Algorithm 3 Convert a reference pair to the canonical reference pair
1: function Canonize(node, (l, r))
2:   if node = NIL then
3:     if (l, r) = φ then
4:       return (NIL, l)
5:     else
6:       return Canonize(root, (l + 1, r))
7:   while l ≤ r do    ▷ (l, r) isn't empty
8:     ((l', r'), node') ← Children(node)[sl]
9:     if r - l ≥ r' - l' then
10:      l ← l + r' - l' + 1    ▷ Remove |(l', r')| characters from (l, r)
11:      node ← node'
12:    else
13:      break
14:  return (node, l)
If the passed-in node parameter is NIL, it means a very special case. The
function is called like the following.

Canonize(Suffix-Link(root), (l, r))

Because the suffix link of the root points to NIL, the result should be (root, (l +
1, r)) if (l, r) is not φ. Otherwise, (NIL, φ) is returned to indicate a terminal
position.

We explain this special case in detail in later sections.
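A Python sketch of Algorithm 3 (assuming the Node layout from the reference-pair
sketch above, the shared string s with the book's 1-based indices, and the
convention that (l, r) with l > r denotes the empty string) could be:

# Convert (node, (l, r)) to the canonical reference pair (node', l').
def canonize(node, l, r, s, root):
    if node is None:
        if l > r:                      # (l, r) is the empty string
            return (None, l)
        return canonize(root, l + 1, r, s, root)
    while l <= r:                      # while (l, r) isn't empty
        (lp, rp), next_node = node.children[s[l - 1]]
        if r - l >= rp - lp:
            l += rp - lp + 1           # remove |(l', r')| chars from (l, r)
            node = next_node
        else:
            break
    return (node, l)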
The algorithm

In section 6.3.1, we mentioned that all updating to leaves is trivial, because we
only need to append the new coming character to the leaf. With reference pairs,
it means that when updating SuffixTree(Si) to SuffixTree(Si+1), all reference
pairs in form (node, (l, i)) are leaves. They will change to (node, (l, i + 1)) next
time.
  • 21. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 12.4.2 Insertion and appending . . . . . . . . . . . . . . . . . . . 328
  • 22. CONTENTS 7 12.4.3 random access . . . . . . . . . . . . . . . . . . . . . . . . 328 12.4.4 removing and balancing . . . . . . . . . . . . . . . . . . . 329 12.5 Concatenate-able list . . . . . . . . . . . . . . . . . . . . . . . . . 331 12.6 Finger tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 12.6.1 De
  • 23. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 12.6.2 Insert element to the head of sequence . . . . . . . . . . . 337 12.6.3 Remove element from the head of sequence . . . . . . . . 340 12.6.4 Handling the ill-formed
  • 24. nger tree when removing . . . . 341 12.6.5 append element to the tail of the sequence . . . . . . . . . 346 12.6.6 remove element from the tail of the sequence . . . . . . . 347 12.6.7 concatenate . . . . . . . . . . . . . . . . . . . . . . . . . . 349 12.6.8 Random access of
  • 25. nger tree . . . . . . . . . . . . . . . . 354 12.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 365 V Sorting and Searching 369 13 Divide and conquer, Quick sort vs. Merge sort 371 13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 13.2 Quick sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 13.2.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 372 13.2.2 Strict weak ordering . . . . . . . . . . . . . . . . . . . . . 373 13.2.3 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 13.2.4 Minor improvement in functional partition . . . . . . . . 377 13.3 Performance analysis for quick sort . . . . . . . . . . . . . . . . . 379 13.3.1 Average case analysis ? . . . . . . . . . . . . . . . . . . . 380 13.4 Engineering Improvement . . . . . . . . . . . . . . . . . . . . . . 383 13.4.1 Engineering solution to duplicated elements . . . . . . . . 383 13.5 Engineering solution to the worst case . . . . . . . . . . . . . . . 390 13.6 Other engineering practice . . . . . . . . . . . . . . . . . . . . . . 394 13.7 Side words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 13.8 Merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 13.8.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 396 13.9 In-place merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 403 13.9.1 Naive in-place merge . . . . . . . . . . . . . . . . . . . . . 403 13.9.2 in-place working area . . . . . . . . . . . . . . . . . . . . 404 13.9.3 In-place merge sort vs. linked-list merge sort . . . . . . . 409 13.10Nature merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 411 13.11Bottom-up merge sort . . . . . . . . . . . . . . . . . . . . . . . . 416 13.12Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 13.13Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 14 Searching 423 14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 14.2 Sequence search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 14.2.1 Divide and conquer search . . . . . . . . . . . . . . . . . . 424 14.2.2 Information reuse . . . . . . . . . . . . . . . . . . . . . . . 444 14.3 Solution searching . . . . . . . . . . . . . . . . . . . . . . . . . . 471 14.3.1 DFS and BFS . . . . . . . . . . . . . . . . . . . . . . . . . 471 14.3.2 Search the optimal solution . . . . . . . . . . . . . . . . . 507
  • 26. 8 CONTENTS 14.4 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535 VI Appendix 539 Appendices A Lists 541 A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 A.2 List De
  • 27. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 A.2.1 Empty list . . . . . . . . . . . . . . . . . . . . . . . . . . . 542 A.2.2 Access the element and the sub list . . . . . . . . . . . . . 542 A.3 Basic list manipulation . . . . . . . . . . . . . . . . . . . . . . . . 543 A.3.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . 543 A.3.2 Empty testing and length calculating . . . . . . . . . . . . 544 A.3.3 indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 A.3.4 Access the last element . . . . . . . . . . . . . . . . . . . 546 A.3.5 Reverse indexing . . . . . . . . . . . . . . . . . . . . . . . 547 A.3.6 Mutating . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 A.3.7 sum and product . . . . . . . . . . . . . . . . . . . . . . . 559 A.3.8 maximum and minimum . . . . . . . . . . . . . . . . . . . 563 A.4 Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567 A.4.1 mapping and for-each . . . . . . . . . . . . . . . . . . . . 567 A.4.2 reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573 A.5 Extract sub-lists . . . . . . . . . . . . . . . . . . . . . . . . . . . 575 A.5.1 take, drop, and split-at . . . . . . . . . . . . . . . . . . . 575 A.5.2 breaking and grouping . . . . . . . . . . . . . . . . . . . . 577 A.6 Folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582 A.6.1 folding from right . . . . . . . . . . . . . . . . . . . . . . . 582 A.6.2 folding from left . . . . . . . . . . . . . . . . . . . . . . . 584 A.6.3 folding in practice . . . . . . . . . . . . . . . . . . . . . . 587 A.7 Searching and matching . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.1 Existence testing . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.2 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.3
  • 29. ltering . . . . . . . . . . . . . . . . . . . . . 589 A.7.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 A.8 zipping and unzipping . . . . . . . . . . . . . . . . . . . . . . . . 594 A.9 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 597 GNU Free Documentation License 601 1. APPLICABILITY AND DEFINITIONS . . . . . . . . . . . . . . . 601 2. VERBATIM COPYING . . . . . . . . . . . . . . . . . . . . . . . . 603 3. COPYING IN QUANTITY . . . . . . . . . . . . . . . . . . . . . . 603 4. MODIFICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . 604 5. COMBINING DOCUMENTS . . . . . . . . . . . . . . . . . . . . . 605 6. COLLECTIONS OF DOCUMENTS . . . . . . . . . . . . . . . . . 606 7. AGGREGATION WITH INDEPENDENT WORKS . . . . . . . . 606 8. TRANSLATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606 9. TERMINATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 10. FUTURE REVISIONS OF THIS LICENSE . . . . . . . . . . . . 607
  • 30. CONTENTS 9 11. RELICENSING . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 ADDENDUM: How to use this License for your documents . . . . . . 608
0.1 Why?

`Are algorithms useful?' Some programmers say that they seldom use any serious data structures or algorithms in real work, such as commercial application development. Even when they need some of them, they have already been provided by libraries. For example, the C++ standard template library (STL) provides sort and selection algorithms as well as the vector, queue, and set data structures. It seems that knowing how to use the library as a tool is quite enough.

Instead of answering this question directly, I would like to say algorithms and data structures are critical in solving `interesting problems', the usefulness of the problem set aside. Let's start with two problems that look like they can be solved in a brute-force way even by a fresh programmer.

0.2 The smallest free ID problem, the power of algorithms

This problem is discussed in Chapter 1 of Richard Bird's book [1]. It's common that applications and systems use IDs (identifiers) to manage objects and entities.
At any time, some IDs are used, and some of them are available for use. When a client tries to acquire a new ID, we want to always allocate the smallest available one. Suppose IDs are non-negative integers and all IDs in use are kept in a list (or an array) which is not ordered. For example:

[18, 4, 8, 9, 16, 1, 14, 7, 19, 3, 0, 5, 2, 11, 6]

How can you find the smallest free ID, which is 10, from the list?
It seems the solution is quite easy even without any serious algorithms.

1: function Min-Free(A)
2:   x ← 0
3:   loop
4:     if x ∉ A then
5:       return x
6:     else
7:       x ← x + 1

Where the ∉ test is realized as below.

1: function `∉'(x, X)
2:   for i ← 1 to |X| do
3:     if x = X[i] then
4:       return False
5:   return True

Some languages provide handy tools which wrap this linear time process. For example in Python, this algorithm can be directly translated as the following.

def brute_force(lst):
    i = 0
    while True:
        if i not in lst:
            return i
        i = i + 1
It seems this problem is trivial. However, there will be millions of IDs in a large system. The speed of this solution is poor in such a case, for it takes O(n^2) time, where n is the length of the ID list. On my computer (2 cores, 2.10 GHz, with 2G RAM), a C program using this solution takes an average of 5.4 seconds to search a minimum free number among 100,000 IDs, and it takes more than 8 minutes to handle a million numbers.

0.2.1 Improvement 1

The key idea to improve the solution is based on the fact that for a series of n numbers x1, x2, ..., xn, if there are free numbers, some of the xi are outside the range [0, n); otherwise the list is exactly a permutation of 0, 1, ..., n - 1 (so that max(xi) = n - 1), and n should be returned as the minimum free number. So we have the following fact:

minfree(x1, x2, ..., xn) ≤ n    (1)

One solution is to use an array of n + 1 flags to mark whether a number in the range [0, n] is free.

1: function Min-Free(A)
2:   F ← [False, False, ..., False] where |F| = n + 1
3:   for ∀x ∈ A do
4:     if x ≤ n then
5:       F[x] ← True
6:   for i ∈ [0, n] do
7:     if F[i] = False then
8:       return i

Line 2 initializes a flag array of all False values. This takes O(n) time. Then the algorithm scans all numbers in A and marks the relative flag True if the value is no greater than n. This step also takes O(n) time. Finally, the algorithm performs a linear time search for the first flag with False value.
So the total performance of this algorithm is O(n). Note that we use n + 1 flags instead of n flags to cover the special case that sorted(A) = [0, 1, 2, ..., n - 1].

Although the algorithm only takes O(n) time, it needs extra O(n) space to store the flags. This solution is much faster than the brute force one. On my computer, the relevant Python program takes an average of 0.02 second when dealing with 100,000 numbers.
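The Python program referred to here is not listed in the post; below is a minimal sketch of the flag-array method in Python, assuming the input is a list of non-negative integers (the function name min_free is my own choice).

def min_free(lst):
    n = len(lst)
    flags = [False] * (n + 1)      # flags for the range [0, n]
    for x in lst:
        if x <= n:
            flags[x] = True        # mark x as used
    for i in range(n + 1):
        if not flags[i]:
            return i               # the first unmarked number is free

For the example list given earlier, min_free returns 10.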
We haven't fine-tuned this algorithm yet. Observe that each time we have to allocate memory to create an array of n + 1 flags, and release the memory when finished. Memory allocation and release are very expensive, and they cost us a lot of processing time.

There are two ways in which we can improve on this solution. One is to allocate the flags array in advance and reuse it for all calls of our function to find the smallest free number. The other is to use bit-wise flags instead of a flag array. The following is the C program based on these two minor improvements. (All programs can be downloaded along with this series of posts.)
#define N 1000000    // 1 million
#define WORD_LENGTH (sizeof(int) * 8)

void setbit(unsigned int* bits, unsigned int i) {
    bits[i / WORD_LENGTH] |= 1 << (i % WORD_LENGTH);
}

int testbit(unsigned int* bits, unsigned int i) {
    return bits[i / WORD_LENGTH] & (1 << (i % WORD_LENGTH));
}

unsigned int bits[N / WORD_LENGTH + 1];

int min_free(int* xs, int n) {
    int i, len = N / WORD_LENGTH + 1;
    for (i = 0; i < len; ++i)
        bits[i] = 0;
    for (i = 0; i < n; ++i)
        if (xs[i] < n)
            setbit(bits, xs[i]);
    for (i = 0; i <= n; ++i)
        if (!testbit(bits, i))
            return i;
}

This C program can handle 1,000,000 (1 million) IDs in just 0.023 second on my computer. The last for-loop can be further improved as seen below, but this is just minor fine-tuning.
for (i = 0; ; ++i)
    if (~bits[i] != 0)
        for (j = 0; ; ++j)
            if (!testbit(bits, i * WORD_LENGTH + j))
                return i * WORD_LENGTH + j;

0.2.2 Improvement 2, Divide and Conquer

Although the above improvement is much faster, it costs O(n) extra space to keep a check list. If n is a huge number, this means a huge amount of space is wasted.

The typical divide and conquer strategy is to break the problem into smaller ones, and solve these to get the final answer.
We can put all numbers xi ≤ ⌊n/2⌋ into a sub-list A′ and put all the others into a second sub-list A′′. Based on formula (1), if the length of A′ is exactly ⌊n/2⌋, it means the first half of the numbers is `full', which indicates that the minimum free number must be in A′′, so we'll recursively seek in the shorter list A′′. Otherwise, it means the minimum free number is located in A′, which again leads to a smaller problem.

When we search the minimum free number in A′′, the conditions change a little bit: we are not searching for the smallest free number starting from 0, but actually from ⌊n/2⌋ + 1 as the lower bound.
So the algorithm is something like minfree(A, l, u), where l is the lower bound and u is the upper bound index of the element. Note that there is a trivial case: if the number list is empty, we merely return the lower bound as the result.

This divide and conquer solution can be formally expressed as a function:

minfree(A) = search(A, 0, |A| - 1)

search(A, l, u) =
    l : A = ∅
    search(A′′, m + 1, u) : |A′| = m - l + 1
    search(A′, l, m) : otherwise

where

m = ⌊(l + u)/2⌋
A′ = {x | x ∈ A ∧ x ≤ m}
A′′ = {x | x ∈ A ∧ x > m}

It is obvious that this algorithm doesn't need any extra space [2]. Each call performs O(|A|) comparisons to build A′ and A′′. After that the problem scale halves, so the time needed for this algorithm is T(n) = T(n/2) + O(n), which reduces to O(n). Another way to analyze the performance is by observing that the first call takes O(n) to build A′ and A′′, the second call takes O(n/2), the third O(n/4), and so on. The total time is O(n + n/2 + n/4 + ...) = O(2n) = O(n).
In functional programming languages such as Haskell, partitioning a list is already provided in the basic library, and this algorithm can be translated as the following.

import Data.List

minFree xs = bsearch xs 0 (length xs - 1)

bsearch xs l u | xs == [] = l
               | length as == m - l + 1 = bsearch bs (m + 1) u
               | otherwise = bsearch as l m
    where
        m = (l + u) `div` 2
        (as, bs) = partition (<= m) xs

0.2.3 Expressiveness vs. Performance

Imperative language programmers may be concerned about the performance of this kind of implementation. For instance, in this minimum free ID problem, the number of recursive calls is O(lg n), which means the stack size consumed is O(lg n). It's not free in terms of space. But if we want to avoid that, we can eliminate the recursion by replacing it with iteration [3], which yields the following C program.

[2]: Procedural programmers may note that it actually takes O(lg n) stack space for bookkeeping. As we'll see later, this can be eliminated either by tail recursion optimization (for instance, gcc -O2), or by manually changing the recursion to iteration.
/* swap exchanges two int values; the original program assumes such a helper */
#define swap(a, b) { int tmp = (a); (a) = (b); (b) = tmp; }

int min_free(int* xs, int n) {
    int l = 0;
    int u = n - 1;
    while (n) {
        int m = (l + u) / 2;
        int right, left = 0;
        for (right = 0; right < n; ++right)
            if (xs[right] <= m) {
                swap(xs[left], xs[right]);
                ++left;
            }
        if (left == m - l + 1) {
            xs = xs + left;
            n = n - left;
            l = m + 1;
        }
        else {
            n = left;
            u = m;
        }
    }
    return l;
}

This program uses a `quick-sort' like approach to re-arrange the array so that all the elements before left are less than or equal to m, while those between left and right are greater than m. This is shown in figure 1.
Figure 1: Divide the array: all x[i] ≤ m where 0 ≤ i < left, all x[i] > m where left ≤ i < right, and the remaining elements are unknown.

This program is fast and it doesn't need extra stack space. However, compared to the previous Haskell program, it's hard to read and the expressiveness has decreased. We have to balance performance and expressiveness.

[3]: This is done automatically in most functional languages, since our function is in tail recursive form, which lends itself perfectly to this transformation.
0.3 The number puzzle, power of data structure

If the first problem, to find the minimum free number, is somewhat useful in practice, this problem is a `pure' one for fun. The puzzle is to find the 1,500th number which contains only the factors 2, 3 or 5. The first 3 such numbers are of course 2, 3, and 5. Number 60 = 2^2 × 3^1 × 5^1, for example, is the 25th such number. Number 21 = 2^0 × 3^1 × 7^1 isn't a valid number, because it contains the factor 7. The first 10 such numbers are listed as the following:

2, 3, 4, 5, 6, 8, 9, 10, 12, 15

If we consider 1 = 2^0 × 3^0 × 5^0, then 1 is also a valid number, and it is the first one.
0.3.1 The brute-force solution

It seems the solution is quite easy without need of any serious algorithms. We can check numbers one by one starting from 1, extracting all factors of 2, 3 and 5 to see if the remaining part equals 1.

1: function Get-Number(n)
2:   x ← 1
3:   i ← 0
4:   loop
5:     if Valid?(x) then
6:       i ← i + 1
7:       if i = n then
8:         return x
9:     x ← x + 1

10: function Valid?(x)
11:   while x mod 2 = 0 do
12:     x ← x/2
13:   while x mod 3 = 0 do
14:     x ← x/3
15:   while x mod 5 = 0 do
16:     x ← x/5
17:   if x = 1 then
18:     return True
19:   else
20:     return False
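As a concrete illustration, here is a direct Python transcription of the two functions above; it is a sketch, and the names valid and get_number are my own (the author measured a C version).

def valid(x):
    # strip all factors of 2, 3 and 5; x is valid if nothing else remains
    for f in [2, 3, 5]:
        while x % f == 0:
            x = x // f
    return x == 1

def get_number(n):
    x = 1
    i = 0
    while True:
        if valid(x):
            i = i + 1
            if i == n:
                return x
        x = x + 1

Note that, as the text remarks below, 1 counts as the first valid number, so get_number(10) returns 12, and get_number(1500) would eventually return 859963392, though only after a long wait.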
This `brute-force' algorithm works for most small n. However, to find the 1,500th number (which is 859963392), the C program based on this algorithm takes 40.39 seconds on my computer. I had to kill the program after 10 minutes when I increased n to 15,000.

0.3.2 Improvement 1

Analysis of the above algorithm shows that the modulo and division calculations are very expensive [2], and they are executed a lot in loops. Instead of checking whether a number contains only 2, 3, or 5 as factors, an alternative solution is to construct such numbers from these factors.
We start from 1, and multiply it by 2, 3, or 5 to generate the rest of the numbers. The problem turns out to be how to generate these candidate numbers in order. One handy way is to utilize the queue data structure.

A queue allows us to push elements at one end and pop them at the other end, so that the element pushed first is also popped out first.
This property is called FIFO (First-In-First-Out). The idea is to push 1 as the only element to the queue, then pop an element, multiply it by 2, 3, and 5 to get 3 new elements, and push them back to the queue in order. Note that a new element may already exist in the queue; in such a case we just drop it. A new element may also be smaller than others in the queue, so we must insert it at the correct position. Figure 2 illustrates this idea.

Figure 2: First 4 steps of constructing numbers with a queue. 1. The queue is initialized with 1 as the only element; 2. New elements 2, 3, and 5 are pushed back; 3. New elements 4, 6, and 10 are pushed back in order; 4. New elements 9 and 15 are pushed back; element 6 already exists.

This algorithm is shown as the following.

1: function Get-Number(n)
2:   Q ← NIL
3:   Enqueue(Q, 1)
4:   while n > 0 do
5:     x ← Dequeue(Q)
6:     Unique-Enqueue(Q, 2x)
7:     Unique-Enqueue(Q, 3x)
8:     Unique-Enqueue(Q, 5x)
9:     n ← n - 1
10:   return x

11: function Unique-Enqueue(Q, x)
12:   i ← 0
13:   while i < |Q| ∧ Q[i] < x do
14:     i ← i + 1
15:   if i < |Q| ∧ x = Q[i] then
16:     return
17:   Insert(Q, i, x)
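The measured C program is not included in the post; the sketch below expresses the same ordered, duplicate-free queue in Python, using a plain list and the standard bisect module for the ordered insertion. These implementation choices are mine, not the original post's.

import bisect

def get_number(n):
    q = [1]
    while n > 0:
        x = q.pop(0)                       # dequeue the smallest element
        for factor in [2, 3, 5]:
            y = factor * x
            i = bisect.bisect_left(q, y)
            if i == len(q) or q[i] != y:   # drop duplicates
                q.insert(i, y)             # keep the queue ordered
        n = n - 1
    return x

get_number(1500) returns 859963392; the repeated scans and insertions make it quadratic overall, matching the analysis that follows.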
The insert function takes O(|Q|) time to find the proper position and insert the element. If the element already exists, it just returns. A rough estimation tells that the length of the queue increases in proportion to n (each time we extract one element and push at most 3 new ones, an increase ratio of about 2), so the total running time is O(1 + 2 + 3 + ... + n) = O(n^2). Figure 3 plots the number of queue accesses against n; it is a quadratic curve, which reflects the O(n^2) performance.

Figure 3: Queue access count v.s. n.

The C program based on this algorithm takes only 0.016 second to get the right answer 859963392, which is about 2500 times faster than the brute force solution.

Improvement 1 can also be considered in a recursive way.
Suppose X is the infinite series of all numbers which only contain factors of 2, 3, or 5. The following formula shows an interesting relationship:

X = {1} ∪ {2x : ∀x ∈ X} ∪ {3x : ∀x ∈ X} ∪ {5x : ∀x ∈ X}    (2)

where we can define ∪ in a special form, so that all elements are stored in order as well as unique to each other. Suppose that X = {x1, x2, x3, ...}, Y = {y1, y2, y3, ...}, X′ = {x2, x3, ...} and Y′ = {y2, y3, ...}. We have:

X ∪ Y =
    X : Y = ∅
    Y : X = ∅
    {x1, X′ ∪ Y} : x1 < y1
    {x1, X′ ∪ Y′} : x1 = y1
    {y1, X ∪ Y′} : x1 > y1

In a functional programming language such as Haskell, which supports lazy evaluation, the above infinite series functions can be translated into the following program.
ns = 1 : merge (map (*2) ns) (merge (map (*3) ns) (map (*5) ns))

merge [] l = l
merge l [] = l
merge (x:xs) (y:ys) | x < y = x : merge xs (y:ys)
                    | x == y = x : merge xs ys
                    | otherwise = y : merge (x:xs) ys

By evaluating ns !! (n-1), we can get the 1500th number as below.

ns !! (1500-1)
859963392

0.3.3 Improvement 2

Considering the above solution: although it is much faster than the brute-force one, it still has some drawbacks.
It produces many duplicated numbers, which are finally dropped when examining the queue. Secondly, it does a linear scan and insertion to keep the order of all elements in the queue, which degrades the ENQUEUE operation from O(1) to O(|Q|).

If we use three queues instead of only one, we can improve the solution one step further. Denote these queues as Q2, Q3, and Q5, and initialize them as Q2 = {2}, Q3 = {3} and Q5 = {5}. Each time, we DEQUEUE the smallest head x of Q2, Q3, and Q5, and do the following test:

- If x comes from Q2, we ENQUEUE 2x, 3x, and 5x back to Q2, Q3, and Q5 respectively;
- If x comes from Q3, we only need to ENQUEUE 3x to Q3 and 5x to Q5; we needn't ENQUEUE 2x to Q2, because 2x already exists in Q3;
- If x comes from Q5, we only need to ENQUEUE 5x to Q5; there is no need to ENQUEUE 2x or 3x, because they are already in the queues.

We repeatedly ENQUEUE the smallest one until we find the n-th element.
The algorithm based on this idea is implemented as below.

1: function Get-Number(n)
2:   if n = 1 then
3:     return 1
4:   else
5:     Q2 ← {2}
6:     Q3 ← {3}
7:     Q5 ← {5}
8:     while n > 1 do
9:       x ← min(Head(Q2), Head(Q3), Head(Q5))
10:      if x = Head(Q2) then
11:        Dequeue(Q2)
12:        Enqueue(Q2, 2x)
13:        Enqueue(Q3, 3x)
14:        Enqueue(Q5, 5x)
15:      else if x = Head(Q3) then
16:        Dequeue(Q3)
17:        Enqueue(Q3, 3x)
18:        Enqueue(Q5, 5x)
19:      else
20:        Dequeue(Q5)
21:        Enqueue(Q5, 5x)
22:      n ← n - 1
23:    return x

Figure 4: First 4 steps of constructing numbers with Q2, Q3, and Q5. 1. The queues are initialized with 2, 3, 5 as the only elements; 2. New elements 4, 6, and 10 are pushed back; 3. New elements 9 and 15 are pushed back; 4. New elements 8, 12, and 20 are pushed back; 5. New element 25 is pushed back.

This algorithm loops n times. Within each loop it extracts one head element from the three queues, which takes constant time; then it appends one to three new elements at the ends of the queues, which is bound to constant time too. So the total time of the algorithm is bound to O(n). The C++ program translated from this algorithm, shown below, takes less than 1 second to produce the 1500th number, 859963392.

#include <queue>
#include <algorithm>    // for min
using namespace std;

typedef unsigned long Integer;

Integer get_number(int n) {
    if (n == 1)
        return 1;
    queue<Integer> Q2, Q3, Q5;
    Q2.push(2);
    Q3.push(3);
    Q5.push(5);
    Integer x;
    while (n-- > 1) {
        x = min(min(Q2.front(), Q3.front()), Q5.front());
        if (x == Q2.front()) {
            Q2.pop();
            Q2.push(x * 2);
            Q3.push(x * 3);
            Q5.push(x * 5);
        }
        else if (x == Q3.front()) {
            Q3.pop();
            Q3.push(x * 3);
            Q5.push(x * 5);
        }
        else {
            Q5.pop();
            Q5.push(x * 5);
        }
    }
    return x;
}

This solution can also be implemented in a functional way.
We define a function take(n), which returns the first n numbers containing only the factors 2, 3, or 5.

take(n) = f(n, {1}, {2}, {3}, {5})

where

f(n, X, Q2, Q3, Q5) =
    X : n = 1
    f(n - 1, X ∪ {x}, Q2′, Q3′, Q5′) : otherwise

x = min(Q2[1], Q3[1], Q5[1])

(Q2′, Q3′, Q5′) =
    ({Q2[2], Q2[3], ...} ∪ {2x}, Q3 ∪ {3x}, Q5 ∪ {5x}) : x = Q2[1]
    (Q2, {Q3[2], Q3[3], ...} ∪ {3x}, Q5 ∪ {5x}) : x = Q3[1]
    (Q2, Q3, {Q5[2], Q5[3], ...} ∪ {5x}) : x = Q5[1]

And this functional definition can be realized in Haskell as the following.
ks 1 xs _ = xs
ks n xs (q2, q3, q5) = ks (n-1) (xs ++ [x]) update
    where
        x = minimum $ map head [q2, q3, q5]
        update | x == head q2 = (tail q2 ++ [x*2], q3 ++ [x*3], q5 ++ [x*5])
               | x == head q3 = (q2, tail q3 ++ [x*3], q5 ++ [x*5])
               | otherwise = (q2, q3, tail q5 ++ [x*5])

takeN n = ks n [1] ([2], [3], [5])

Invoking `last $ takeN 1500' will generate the correct answer, 859963392.

0.4 Notes and short summary

If we review the two puzzles, we find that in both cases the brute-force solutions are very weak.
In the first problem, the brute-force solution is quite poor at dealing with a long ID list, while in the second problem it doesn't work at all. The first problem shows the power of algorithms, while the second problem tells why data structures are important. There are plenty of interesting problems which were hard to solve before computers were invented. With the aid of computers and programming, we are able to find the answers in a quite different way. Compared to what we learned in mathematics courses in school, we haven't been taught methods like this.

While there are already a lot of wonderful books about algorithms, data structures and math, few of them provide a comparison between the procedural solution and the functional solution. From the above discussion, it can be seen that the functional solutions are sometimes very expressive, and close to what we are familiar with in mathematics.

This series of posts focuses on providing both imperative and functional algorithms and data structures. Many functional data structures can be referenced from Okasaki's book [3], while the imperative ones can be found in classic textbooks [2] or even in Wikipedia. Multiple programming languages, including C, C++, Python, Haskell, and Scheme/Lisp, will be used. In order to make it easy to read for programmers with different backgrounds, pseudo code and mathematical functions are the regular descriptions of each post.

The author is NOT a native English speaker; the reason why this book is only available in English for the time being is that the contents are still changing frequently. Any feedback, comments, or criticism are welcome.

0.5 Structure of the contents

In the following series of posts, I'll first introduce elementary data structures before algorithms, because many algorithms need knowledge of data structures as a prerequisite.
The `hello world' data structure, binary search tree, is the first topic. Then we introduce how to solve the balance problem of binary search trees. After that, I'll show other interesting trees: Trie, Patricia, and suffix trees are useful in text manipulation, while B-trees are commonly used in file system and database implementation.

The second part of data structures is about heaps. We'll provide a general heap definition and introduce binary heaps by array and by explicit binary trees. Then we'll extend to K-ary heaps, including Binomial heaps, Fibonacci heaps, and pairing heaps.

Arrays and queues are typically considered among the easiest data structures; however, we'll show how difficult it is to implement them in the third part.

As the elementary sort algorithms, we'll introduce insertion sort, quick sort, merge sort, etc. in both imperative and functional ways.

The final part is about searching; besides element searching, we'll also show string matching algorithms such as KMP.

All the posts are provided under the GNU FDL (Free Documentation License), and programs are under the GNU GPL.

0.6 Appendix

All programs provided along with this book are free for downloading.

Download position: http://sites.google.com/site/algoxy/home
Bibliography

[1] Richard Bird. Pearls of Functional Algorithm Design. Cambridge University Press; 1st edition (November 1, 2010). ISBN-10: 0521513383

[2] Jon Bentley. Programming Pearls (2nd Edition). Addison-Wesley Professional; 2nd edition (October 7, 1999). ISBN-13: 978-0201657883

[3] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press (July 1, 1999). ISBN-13: 978-0521663502

[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937
Chapter 1
Binary search tree, the `hello world' data structure

1.1 Introduction

Arrays or lists are typically considered the `hello world' data structures. However, we'll see they are not actually particularly easy to implement. In some procedural settings, arrays are the most elementary data structures, and it is possible to implement linked lists using arrays (see section 10.3 in [2]). On the other hand, in some functional settings, linked lists are the elementary building blocks used to create arrays and other data structures.

Considering these factors, we start with Binary Search Trees (or BST) as the `hello world' data structure, using an interesting problem Jon Bentley mentioned in `Programming Pearls' [2]. The problem is to count the number of times each word occurs in a large text. One solution in C++ is below:

#include <iostream>
#include <map>
#include <string>
using namespace std;

int main(int, char**) {
    map<string, int> dict;
    string s;
    while (cin >> s)
        ++dict[s];
    map<string, int>::iterator it = dict.begin();
    for (; it != dict.end(); ++it)
        cout << it->first << ": " << it->second << "\n";
}

And we can run it to produce the result using the following UNIX commands.

$ g++ wordcount.cpp -o wordcount
$ cat bbe.txt | ./wordcount > wc.txt

(This is not a UNIX-only pattern; in the Windows OS it can be achieved by: type bbe.txt | wordcount.exe > wc.txt)

The map provided in the standard template library is a kind of balanced BST with augmented data. Here we use the words in the text as the keys and the number of occurrences as the augmented data.
This program is fast, and it reflects the power of BSTs. We'll introduce how to implement BSTs in this section and show how to balance them in a later section.

Before we dive into BSTs, let's first introduce the more general binary tree. Binary trees are recursively defined; BSTs are just one type of binary tree. A binary tree is usually defined in the following way: a binary tree is either an empty node, or a node containing 3 parts: a value, a left child which is a binary tree, and a right child which is also a binary tree. Figure 1.1 shows this concept and an example binary tree.

Figure 1.1: Binary tree concept and an example. (a) Concept of binary tree; (b) An example binary tree.

A BST is a binary tree where the following applies to each node: all the values in the left child tree are less than the value of this node, and the value of this node is less than any value in its right child tree. Figure 1.2 shows an example of a binary search tree. Comparing it with Figure 1.1, we can see the differences in how keys are ordered between them.
Figure 1.2: A binary search tree example.

1.2 Data Layout
Based on the recursive definition of BSTs, we can draw the data layout in a procedural setting with pointers, as in Figure 1.3. The node contains a field for the key, which can be augmented with satellite data, a field containing a pointer to the left child, and a field pointing to the right child. In order to back-track to an ancestor easily, a parent field can be provided as well.

In this post, we'll ignore the satellite data for simple illustration purposes. Based on this layout, the node of a binary search tree can be defined in a procedural language, such as C++, as the following.
template<class T>
struct node {
    node(T x) : key(x), left(0), right(0), parent(0) {}
    ~node() {
        delete left;
        delete right;
    }

    node* left;
    node* right;
    node* parent;    // parent is optional; it's helpful for succ/pred
    T key;
};

There is another setting, for instance in Scheme/Lisp languages, where the elementary data structure is the linked list. Figure 1.4 shows how a binary search tree node can be built on top of linked lists.
Figure 1.3: Layout of nodes with parent field.

Figure 1.4: Binary search tree node layout on top of linked list. Here `left ...' and `right ...' are either empty, or binary search tree nodes composed in the same way.
Because in a pure functional setting it's hard to use pointers to back-track to ancestors (and typically there is no need to do back-tracking, since we can provide a top-down solution in a recursive way), there is no `parent' field in such a layout.

For simplicity, we'll skip the detailed layout in the future and only focus on the logical layout of data structures. For example, below is the definition of a binary search tree node in Haskell.

data Tree a = Empty
            | Node (Tree a) a (Tree a)

1.3 Insertion

To insert a key k (possibly along with a value in practice) into a binary search tree T, we can follow a quite straightforward way:

- If the tree is empty, construct a leaf node with key = k;
- If k is less than the key of the root node, insert it into the left child;
- If k is greater than the key of the root, insert it into the right child.

There is an exceptional case: if k is equal to the key of the root, the key already exists, and we can either overwrite the data or just do nothing. For simplicity, this case is skipped in this post.

The algorithm is described recursively. It is so simple that it is why we consider the binary search tree the `hello world' data structure. Formally, the algorithm can be represented with a recursive function:

insert(T, k) =
    node(∅, k, ∅) : T = ∅
    node(insert(L, k), Key, R) : k < Key
    node(L, Key, insert(R, k)) : otherwise    (1.1)

where

L = left(T)
R = right(T)
Key = key(T)

The node function creates a new node with a given left sub-tree, a key, and a right sub-tree as parameters. ∅ means NIL or Empty. Functions left, right and key are accessors which get the left sub-tree, right sub-tree and the key of a node.

Translating the above function directly to Haskell yields the following program.

insert :: (Ord a) => Tree a -> a -> Tree a
insert Empty k = Node Empty k Empty
insert (Node l x r) k | k < x = Node (insert l k) x r
                      | otherwise = Node l x (insert r k)

This program utilizes the pattern matching features provided by the language. However, even in functional settings without this feature, for instance Scheme/Lisp, the program is still expressive.
(define (insert tree x)
  (cond ((null? tree) (list '() x '()))
        ((< x (key tree))
         (make-tree (insert (left tree) x)
                    (key tree)
                    (right tree)))
        ((> x (key tree))
         (make-tree (left tree)
                    (key tree)
                    (insert (right tree) x)))))

It is possible to turn the algorithm completely into an imperative way, without recursion.

1: function Insert(T, k)
2:   root ← T
3:   x ← Create-Leaf(k)
4:   parent ← NIL
5:   while T ≠ NIL do
6:     parent ← T
7:     if k < Key(T) then
8:       T ← Left(T)
9:     else
10:      T ← Right(T)
11:  Parent(x) ← parent
12:  if parent = NIL then    ▷ tree T is empty
13:    return x
14:  else if k < Key(parent) then
15:    Left(parent) ← x
16:  else
17:    Right(parent) ← x
18:  return root

19: function Create-Leaf(k)
20:   x ← Empty-Node
21:   Key(x) ← k
22:   Left(x) ← NIL
23:   Right(x) ← NIL
24:   Parent(x) ← NIL
25:   return x

Compared with the functional algorithm, this one is obviously more complex, although it is fast and can handle very deep trees. A complete C++ program and a Python program are available along with this post for reference.
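The companion Python program is not listed in the post; the following sketch shows how the iterative insertion could look in Python, following the pseudocode above. The minimal Node class is my own scaffolding, not taken from the original source.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def insert(t, key):
    root, parent = t, None
    x = Node(key)
    while t is not None:            # walk down to find the insertion point
        parent = t
        t = t.left if key < t.key else t.right
    x.parent = parent
    if parent is None:              # the tree was empty
        return x
    if key < parent.key:
        parent.left = x
    else:
        parent.right = x
    return root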
1.4 Traversing

Traversing means visiting every element one by one in a binary search tree. There are 3 ways to traverse a binary tree: the pre-order tree walk, the in-order tree walk, and the post-order tree walk. The names of these traversing methods highlight the order in which we visit the root of a binary search tree.

Since a tree has three parts, the left child, the root (which contains the key and satellite data), and the right child, if we denote them as (left, current, right), the three traversing methods are defined as the following:

- pre-order traverse: visit current, then left, finally right;
- in-order traverse: visit left, then current, finally right;
- post-order traverse: visit left, then right, finally current.

Note that each visiting operation is recursive, and the position in which current is visited determines the name of the traversing method.

For the binary search tree shown in figure 1.2, below are the three different traversing results:

- pre-order traverse result: 4, 3, 1, 2, 8, 7, 16, 10, 9, 14;
- in-order traverse result: 1, 2, 3, 4, 7, 8, 9, 10, 14, 16;
- post-order traverse result: 2, 1, 3, 7, 9, 14, 10, 16, 8, 4.

It can be found that the in-order walk of a binary search tree outputs the elements in increasing order, which is particularly helpful. The definition of the binary search tree ensures this interesting property; the proof of this fact is left as an exercise of this post.

The in-order tree walk algorithm can be described as the following: if the tree is empty, just return; otherwise traverse the left child by in-order walk, then access the key, and finally traverse the right child by in-order walk.

Translating the above description yields a generic map function:

map(f, T) =
    ∅ : T = ∅
    node(l′, k′, r′) : otherwise    (1.2)

where

l′ = map(f, left(T))
r′ = map(f, right(T))
k′ = f(key(T))

If we only need to access the key without creating the transformed tree, we can realize this algorithm in a procedural way, like the below C++ program.

template<class T, class F>
void in_order_walk(node<T>* t, F f) {
    if (t) {
        in_order_walk(t->left, f);
        f(t->key);
        in_order_walk(t->right, f);
    }
}
The function takes a parameter f, which can be a real function or a function object; the program applies f to each node by in-order tree walk.

We can simplify this algorithm one step more to define a function which turns a binary search tree into a sorted list by in-order traversing:

toList(T) =
    ∅ : T = ∅
    toList(left(T)) ∪ {key(T)} ∪ toList(right(T)) : otherwise    (1.3)

Below is the Haskell program based on this definition.

toList :: (Ord a) => Tree a -> [a]
toList Empty = []
toList (Node l x r) = toList l ++ [x] ++ toList r

This provides us a method to sort a list of elements: we can first build a binary search tree from the list, then output the tree by in-order traversing. This method is called `tree sort'. Let's denote the list X = {x1, x2, x3, ..., xn}.
sort(X) = toList(fromList(X))    (1.4)

And we can write it in function composition form:

sort = toList . fromList

where function fromList repeatedly inserts every element into a binary search tree:

fromList(X) = foldL(insert, ∅, X)    (1.5)

It can also be written in partial application form like below:

fromList = foldL insert ∅

For readers who are not familiar with folding from the left, this function can also be defined recursively as the following:

fromList(X) =
    ∅ : X = ∅
    insert(fromList({x2, x3, ..., xn}), x1) : otherwise

We'll use the folding function, as well as function composition and partial evaluation, intensively in the future; please refer to the appendix of this book or [6], [7] and [8] for more information.
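To make tree sort concrete, here is a small, self-contained Python sketch that mirrors the functional definitions above, representing a tree node as a (left, key, right) tuple; the representation and names are my own.

def insert(t, k):
    # t is either None (empty) or a tuple (left, key, right)
    if t is None:
        return (None, k, None)
    l, key, r = t
    if k < key:
        return (insert(l, k), key, r)
    return (l, key, insert(r, k))

def to_list(t):
    # in-order walk yields a sorted list
    if t is None:
        return []
    l, key, r = t
    return to_list(l) + [key] + to_list(r)

def tree_sort(xs):
    t = None
    for x in xs:
        t = insert(t, x)
    return to_list(t)

For example, tree_sort([4, 3, 8, 1, 2, 7, 16, 10, 9, 14]) yields [1, 2, 3, 4, 7, 8, 9, 10, 14, 16].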
Exercise 1.1

- Given the in-order traverse result and the pre-order traverse result, can you reconstruct the tree from these results and figure out the post-order traversing result? Pre-order result: 1, 2, 4, 3, 5, 6. In-order result: 4, 2, 1, 5, 3, 6. Post-order result: ?
- Write a program in your favorite language to reconstruct the binary tree from the pre-order result and the in-order result.
- Prove that the in-order walk outputs the elements stored in a binary search tree in increasing order.
- Can you analyze the performance of tree sort with big-O notation?

1.5 Querying a binary search tree

There are three types of querying for a binary search tree: searching for a key in the tree, finding the minimum or maximum element in the tree, and finding the predecessor or successor of an element in the tree.
1.5.1 Looking up

According to the definition of the binary search tree, searching for a key in a tree can be realized as follows.

- If the tree is empty, the search fails;
- If the key of the root is equal to the value to be found, the search succeeds, and the root is returned as the result;
- If the value is less than the key of the root, search in the left child;
- Otherwise, the value is greater than the key of the root, so search in the right child.

This algorithm can be described with a recursive function as below.

lookup(T, x) =
    ∅ : T = ∅
    T : key(T) = x
    lookup(left(T), x) : x < key(T)
    lookup(right(T), x) : otherwise    (1.6)

In a real application, we may return the satellite data instead of the node as the search result. This algorithm is simple and straightforward. Here is a translation to a Haskell program.

lookup :: (Ord a) => Tree a -> a -> Tree a
lookup Empty _ = Empty
lookup t@(Node l k r) x | k == x = t
                        | x < k = lookup l x
                        | otherwise = lookup r x

If the binary search tree is well balanced, which means that almost all nodes have both non-NIL left and right children, then for N elements the search algorithm takes O(lg N) time to perform.
This is not a formal definition of balance; we'll show one in a later post about red-black trees. If the tree is poorly balanced, the worst case takes O(N) time to search for a key. If we denote the height of the tree as h, we can uniformly state the performance of the algorithm as O(h).

The search algorithm can also be realized without recursion, in a procedural manner.

1: function Search(T, x)
2:   while T ≠ NIL ∧ Key(T) ≠ x do
3:     if x < Key(T) then
4:       T ← Left(T)
5:     else
6:       T ← Right(T)
7:   return T

Below is the C++ program based on this algorithm.

template<class T>
node<T>* search(node<T>* t, T x) {
    while (t && t->key != x) {
        if (x < t->key) t = t->left;
        else t = t->right;
    }
    return t;
}

1.5.2 Minimum and maximum

Minimum and maximum can be implemented from the ordering property of the binary search tree: lesser keys are always in the left child, and greater keys are in the right. For the minimum, we keep traversing the left sub-tree until it is empty, while for the maximum we traverse the right.

min(T) =
    key(T) : left(T) = ∅
    min(left(T)) : otherwise    (1.7)

max(T) =
    key(T) : right(T) = ∅
    max(right(T)) : otherwise    (1.8)

Both functions are bound to O(h) time, where h is the height of the tree. For the balanced binary search tree, min/max are bound to O(lg N) time, while they are O(N) in the worst case. We skip translating them to programs; it's also possible to implement them in a pure procedural way without recursion.
1.5.3 Successor and predecessor

The last kind of querying, finding the successor or predecessor of an element, is useful when a tree is treated as a generic container and traversed with an iterator. It is relatively easier to implement if the parent of a node can be accessed directly.

It seems that a functional solution is hard to find, because there is no pointer-like field linking to the parent node. One solution is to leave `breadcrumbs' when we visit the tree, and use this information to back-track or even re-construct the whole tree. Such a data structure, which contains both the tree and the `breadcrumbs', is called a zipper; please refer to [9] for details.

However, if we consider the original purpose of providing succ/pred functions, `to traverse all of the binary search tree elements one by one' as a generic container, we realize that they don't make significant sense in functional settings, because we can traverse the tree in increasing order by the map function we defined previously.
We'll meet many problems in this series of posts which are only valid in imperative settings, and are not meaningful problems in functional settings at all. One good example is how to delete an element from a red-black tree [3].

In this section, we'll only present the imperative algorithms for finding the successor and predecessor in a binary search tree. When finding the successor of element x, which is the smallest element y that satisfies y > x, there are two cases:

- If the node with value x has a non-NIL right child, the minimum element in the right child is the answer. For example, in Figure 1.2, in order to find the successor of 8, we search its right sub-tree for the minimum element, which yields 9 as the result.
- If node x doesn't have a right child, we need to back-track to find the closest ancestor whose left child is also an ancestor of x. In Figure 1.2, since 2 doesn't have a right sub-tree, we go back to its parent 1. However, node 1 doesn't have a left child, so we go back again and reach node 3. The left child of 3 is also an ancestor of 2; thus, 3 is the successor of node 2.

Based on this description, the algorithm can be given as the following.

1: function Succ(x)
2:   if Right(x) ≠ NIL then
3:     return Min(Right(x))
4:   else
5:     p ← Parent(x)
6:     while p ≠ NIL and x = Right(p) do
7:       x ← p
8:       p ← Parent(p)
9:     return p

The predecessor case is quite similar to the successor algorithm; they are symmetrical to each other.

1: function Pred(x)
2:   if Left(x) ≠ NIL then
3:     return Max(Left(x))
4:   else
5:     p ← Parent(x)
6:     while p ≠ NIL and x = Left(p) do
7:       x ← p
8:       p ← Parent(p)
9:     return p

Below are the Python programs based on these algorithms. They differ slightly from the pseudocode in the while loop conditions.

def succ(x):
    if x.right is not None:
        return tree_min(x.right)
    p = x.parent
    while p is not None and p.left != x:
        x = p
        p = p.parent
    return p

def pred(x):
    if x.left is not None:
        return tree_max(x.left)
    p = x.parent
    while p is not None and p.right != x:
        x = p
        p = p.parent
    return p
Exercise 1.2

- Can you figure out how to iterate over a tree as a generic container by using pred()/succ()? What's the performance of such a traversing process in terms of big-O?
- A reader discussed traversing all elements inside a range [a, b]. In C++, the algorithm looks like the below code: for_each(m.lower_bound(12), m.upper_bound(26), f); Can you provide a purely functional solution for this problem?

1.6 Deletion

Deletion is another `imperative only' topic for binary search trees. This is because deletion mutates the tree, while in purely functional settings we don't modify the tree after building it in most applications. However, one method of deleting an element from a binary search tree in a purely functional way is shown in this section; it actually reconstructs the tree rather than modifying it.

Deletion is the most complex operation for binary search trees, because we must keep the BST property: for any node, all keys in its left sub-tree are less than the key of this node, and the key is less than any key in its right sub-tree. Deleting a node can break this property.

In this post, different from the algorithm described in [2], a simpler one from the SGI STL implementation is used [4]. To delete a node x from a tree:

- If x has no child or only one child, splice x out;
- Otherwise (x has two children), use the minimum element of its right sub-tree to replace x, and splice the original minimum element out.

The simplicity comes from the fact that the minimum element is stored in a node of the right sub-tree which can't have two non-NIL children; it ends up in the trivial case where the node can be directly spliced out of the tree.

Figures 1.5, 1.6, and 1.7 illustrate these different cases when deleting a node from the tree. Based on this idea, the deletion can be defined as the below function.
delete(T, x) =
    ∅ : T = ∅
    node(delete(L, x), K, R) : x < K
    node(L, K, delete(R, x)) : x > K
    R : x = K ∧ L = ∅
    L : x = K ∧ R = ∅
    node(L, y, delete(R, y)) : otherwise    (1.9)

where

L = left(T)
R = right(T)
K = key(T)
y = min(R)
Figure 1.5: x can be spliced out (both children are NIL).

Figure 1.6: Delete a node which has only one non-NIL child. (a)-(b): x is spliced out and replaced by its left child. (c)-(d): x is spliced out and replaced by its right child.

Figure 1.7: Delete a node which has both children. (a) Before deleting x; (b) After deleting x: x is replaced by splicing the minimum element out of its right child.

Translating the function to Haskell yields the below program.

delete :: (Ord a) => Tree a -> a -> Tree a
delete Empty _ = Empty
delete (Node l k r) x | x < k = Node (delete l x) k r
                      | x > k = Node l k (delete r x)
                      -- x == k
                      | isEmpty l = r
                      | isEmpty r = l
                      | otherwise = Node l k' (delete r k')
    where k' = min r

The function `isEmpty' is used to test if a tree is empty (∅).
Note that the algorithm first performs a search to locate the node where the element is to be deleted, and after that it executes the deletion. This algorithm takes O(h) time, where h is the height of the tree.

It's also possible to pass the node, rather than the element, to the algorithm for deletion, so that the search is no longer needed. The imperative algorithm is more complex, because it needs to set the parent properly. The function returns the root of the resulting tree.

1: function Delete(T, x)
2:   root ← T
3:   x′ ← x    ▷ save x
4:   parent ← Parent(x)
5:   if Left(x) = NIL then
6:     x ← Right(x)
7:   else if Right(x) = NIL then
8:     x ← Left(x)
9:   else    ▷ both children are non-NIL
10:    y ← Min(Right(x))
11:    Key(x) ← Key(y)
12:    Copy other satellite data from y to x
13:    if Parent(y) ≠ x then    ▷ y hasn't a left sub-tree
14:      Left(Parent(y)) ← Right(y)
15:    else    ▷ y is the root of the right child of x
16:      Right(x) ← Right(y)
17:    Remove y
18:    return root
19:  if x ≠ NIL then
20:    Parent(x) ← parent
21:  if parent = NIL then    ▷ we are removing the root of the tree
22:    root ← x
23:  else
24:    if Left(parent) = x′ then
25:      Left(parent) ← x
26:    else
27:      Right(parent) ← x
28:  Remove x′
29:  return root

Here we assume the node to be deleted is not empty (otherwise we can simply return the original tree).
  • 147. rst record the root of the tree, create copy pointers to x, and its parent. If either of the children is empty, the algorithm just splice x out. If it has two non-NIL children, we
  • 148. rst located the minimum of right child, replace the key of x to y's, copy the satellite data as well, then splice y out. Note that there is a special case that y is the root node of x's left sub tree. Finally we need reset the stored parent if the original x has only one non- NIL child. If the parent pointer we copied before is empty, it means that we are deleting the root node, so we need return the new root. After the parent is set properly, we
The corresponding Python program for the deletion algorithm is given below. Because Python provides GC, we needn't explicitly remove the node from memory. (The original splicing of y leaves the parent pointer of y's right child stale; the two guarded lines below repair it.)

```python
def tree_delete(t, x):
    if x is None:
        return t
    [root, old_x, parent] = [t, x, x.parent]
    if x.left is None:
        x = x.right
    elif x.right is None:
        x = x.left
    else:
        y = tree_min(x.right)
        x.key = y.key
        if y.parent != x:
            y.parent.left = y.right
        else:
            x.right = y.right
        if y.right is not None:
            y.right.parent = y.parent  # keep the parent pointer consistent
        return root
    if x is not None:
        x.parent = parent
    if parent is None:
        root = x
    else:
        if parent.left == old_x:
            parent.left = x
        else:
            parent.right = x
    return root
```

Because the procedure seeks the minimum element, it runs in O(h) time on a tree of height h.

Exercise 1.3

There is a symmetric solution for deleting a node which has two non-NIL children: replace the element by splicing the maximum one out of the left sub-tree. Write a program to implement this solution.

1.7 Randomly build binary search tree

All operations given in this chapter are bound to O(h) time for a tree of height h. The height affects the performance a lot: for a very unbalanced tree, h tends to O(N), which leads to the worst case, while for a balanced tree h is close to O(lg N) and we get good performance. How to keep the binary search tree balanced will be discussed in the next chapter. However, there exists a simple approach: the binary search tree can be randomly built, as described in [1]. Random building helps to avoid (i.e. decreases the possibility of) an unbalanced binary tree. The idea is that, before building the tree, we call a random process to shuffle the elements.
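As an illustration of this idea, here is a minimal Python sketch (added here, not from the original text; the concrete `insert` function is passed in as a parameter rather than assuming a particular name):

```python
import random

def random_build(keys, insert):
    """Shuffle the keys first, then insert them one by one.

    A random insertion order gives an expected tree height of
    O(lg n), avoiding the degenerate O(n) chains produced by
    sorted input.
    """
    keys = list(keys)
    random.shuffle(keys)
    t = None  # the empty tree
    for k in keys:
        t = insert(t, k)
    return t
```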
Exercise 1.4

Write a randomly building process for a binary search tree.

1.8 Appendix

All programs are provided along with this book. They are free for downloading. We provide C, C++, Python, Haskell, and Scheme/Lisp programs as examples.

Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.
[2] Jon Bentley. Programming Pearls (2nd Edition). Addison-Wesley Professional, 1999. ISBN-13: 978-0201657883.
[3] Chris Okasaki. Ten Years of Purely Functional Data Structures. http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data.html
[4] SGI. Standard Template Library Programmer's Guide. http://www.sgi.com/tech/stl/
[5] http://en.literateprograms.org/Category:Binary_search_tree
[6] http://en.wikipedia.org/wiki/Foldl
[7] http://en.wikipedia.org/wiki/Function_composition
[8] http://en.wikipedia.org/wiki/Partial_application
[9] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's Guide. No Starch Press, April 2011. ISBN: 978-1-59327-283-8.
Chapter 2 The evolution of insertion sort

2.1 Introduction

In the previous chapter, we introduced the `hello world' data structure, the binary search tree. In this chapter, we explain insertion sort, which can be thought of as the `hello world' sorting algorithm¹. It's straightforward, but its performance is not as good as that of divide-and-conquer sorting approaches such as quick sort and merge sort; thus insertion sort is seldom used as the generic sorting utility in modern software libraries. We'll analyze why it is slow and improve it bit by bit, until we reach the best bound of comparison-based sorting, O(n lg n), by evolving it into tree sort, and finally show the connection between the `hello world' data structure and the `hello world' sorting algorithm.
The idea of insertion sort can be vividly illustrated by a real-life poker game [2]. Suppose the cards are shuffled and a player takes cards one by one. At any time, all cards in the player's hand are sorted. When the player gets a new card, he inserts it at the proper position according to the order of points. Figure 2.1 shows this insertion example.

Based on this idea, the algorithm of insertion sort can be given directly as the following.

function Sort(A)
  X ← ∅
  for each x ∈ A do
    Insert(X, x)
  return X

It's easy to express this process with folding, which we mentioned in the chapter about binary search trees.

$$sort = foldL\ insert\ \emptyset \quad (2.1)$$

¹Some readers may argue that bubble sort is the easiest sorting algorithm. Bubble sort isn't covered in this book, as we don't think it's a valuable algorithm [1].
Figure 2.1: Insert card 8 at the proper position in a deck.

Note that in the above algorithm we store the sorted result in X, so this isn't in-place sorting. It's easy to change it into an in-place algorithm.

function Sort(A)
  for i ← 2 to Length(A) do
    insert A_i into the sorted sequence {A′_1, A′_2, ..., A′_{i−1}}

At any time, when we process the i-th element, all elements before i have already been sorted. We continuously insert the current element until all the unsorted data is consumed. This idea is illustrated in figure 2.2.

Figure 2.2: The left part is sorted data; elements are continuously inserted into the sorted part.

We can find a recursive concept in this definition. Thus it can be expressed as the following.

$$sort(A) = \begin{cases} \emptyset & A = \emptyset \\ insert(sort(\{A_2, A_3, ...\}), A_1) & \text{otherwise} \end{cases} \quad (2.2)$$

2.2 Insertion

We haven't yet answered the question of how to realize insertion, however. It's a puzzle how humans locate the proper position so quickly. For a computer, the obvious option is to perform a scan. We can either scan from left to right or vice versa. However, if the sequence is stored in a plain array, it's necessary to scan from right to left.

function Sort(A)
  for i ← 2 to Length(A) do  ▷ insert A[i] into the sorted sequence A[1...i−1]
    x ← A[i]
    j ← i − 1
    while j > 0 ∧ x < A[j] do
      A[j + 1] ← A[j]
      j ← j − 1
    A[j + 1] ← x

One may think scanning from left to right is more natural. However, it isn't as effective as the above algorithm for a plain array. The reason is that it's expensive to insert an element at an arbitrary position in an array, since an array stores elements contiguously: if we want to insert a new element x at position i, we must shift all elements after i (including i+1, i+2, ...) one position to the right. After that, the cell at position i is empty, and we can put x in it. This is illustrated in figure 2.3.
Figure 2.3: Insert x into array A at position i.

If the length of the array is n, insertion from the left needs to examine the first i elements and then perform n − i + 1 moves before placing x into the i-th cell, so it traverses the whole array anyway. If we scan from right to left instead, we examine only the last j = n − i + 1 elements and perform the same number of moves. If j is small (e.g. less than n/2), there is the possibility of performing fewer operations than scanning from left to right.

Translating the above algorithm to Python yields the following code.

```python
def isort(xs):
    n = len(xs)
    for i in range(1, n):
        x = xs[i]
        j = i - 1
        while j >= 0 and x < xs[j]:
            xs[j+1] = xs[j]
            j = j - 1
        xs[j+1] = x
```

Some other equivalent programs can be found, for instance the following ANSI C program. However, this version isn't as effective as the pseudo-code.

```c
void isort(Key* xs, int n) {
    int i, j;
    for (i = 1; i < n; ++i)
        for (j = i - 1; j >= 0 && xs[j+1] < xs[j]; --j)
            swap(xs, j, j+1);
}
```
This is because the swapping function, which exchanges two elements, typically uses a temporary variable like the following:

```c
void swap(Key* xs, int i, int j) {
    Key temp = xs[i];
    xs[i] = xs[j];
    xs[j] = temp;
}
```

So the ANSI C program presented above performs 3m assignments, where m is the number of inner loop iterations, while the pseudo-code, as well as the Python program, uses shift operations instead of swapping and performs only m + 2 assignments.

We can also provide the Insert() function explicitly and call it from the general insertion sort algorithm of the previous section. We skip the detailed realization here and leave it as an exercise.

All the insertion algorithms are bound to O(n) time, where n is the length of the sequence, no matter the differences among them, such as scanning from the left or from the right. Thus the overall performance of insertion sort is quadratic, O(n²).

Exercise 2.1

Provide an explicit insertion function, and call it from the general insertion sort algorithm. Please realize it in both a procedural way and a functional way.

2.3 Improvement 1

Let's go back to the question of why human beings can find the proper position for insertion so quickly. We have shown a solution based on scanning. Note the fact that at any time, all cards at hand are sorted; another possible solution is to use binary search to find the location.
We'll explain search algorithms in a dedicated chapter; binary search is only briefly introduced here for illustration. The algorithm is changed to call a binary search procedure.

function Sort(A)
  for i ← 2 to Length(A) do
    x ← A[i]
    p ← Binary-Search(A[1...i−1], x)
    for j ← i down to p + 1 do
      A[j] ← A[j − 1]
    A[p] ← x

Instead of scanning elements one by one, binary search utilizes the information that all elements in the array slice {A_1, ..., A_{i−1}} are sorted. Let's assume the order is monotonically increasing. To find the position j that satisfies A_{j−1} ≤ x ≤ A_j, we can first examine the middle element, say A_{⌊i/2⌋}. If x is less than it, we recursively perform binary search in the first half of the sequence; otherwise, we only need to search the last half. Each time we halve the elements to be examined, so this search process runs in O(lg n) time to locate the insertion position.

function Binary-Search(A, x)
  l ← 1
  u ← 1 + Length(A)
  while l < u do
    m ← ⌊(l + u)/2⌋
    if A_m = x then
      return m  ▷ found a duplicated element
    else if A_m < x then
      l ← m + 1
    else
      u ← m
  return l

The improved insertion sort algorithm is still bound to O(n²). Compared to the previous section, where we used O(n²) comparisons and O(n²) moves, with binary search we use only O(n lg n) comparisons, but still O(n²) moves. The Python program for this algorithm is given below.

```python
def isort(xs):
    n = len(xs)
    for i in range(1, n):
        x = xs[i]
        p = binary_search(xs[:i], x)
        for j in range(i, p, -1):
            xs[j] = xs[j-1]
        xs[p] = x

def binary_search(xs, x):
    l = 0
    u = len(xs)
    while l < u:
        m = (l + u) // 2
        if xs[m] == x:
            return m
        elif xs[m] < x:
            l = m + 1
        else:
            u = m
    return l
```

Exercise 2.2

Write the binary search in a recursive manner. You needn't use a purely functional programming language.

2.4 Improvement 2

Although we improved the number of comparisons to O(n lg n) in the previous section, the number of moves is still O(n²). The reason movement takes so long is that the sequence is stored in a plain array. The nature of an array is a contiguous data layout, so the insertion operation is expensive. This hints that we can use a linked-list to represent the sequence.
It improves the insertion operation from O(n) to constant time O(1).

$$insert(A, x) = \begin{cases} \{x\} & A = \emptyset \\ \{x\} \cup A & x < A_1 \\ \{A_1\} \cup insert(\{A_2, A_3, ... A_n\}, x) & \text{otherwise} \end{cases} \quad (2.3)$$

Translating the algorithm to Haskell yields the program below.

```haskell
insert :: (Ord a) => [a] -> a -> [a]
insert [] x = [x]
insert (y:ys) x = if x < y then x:y:ys else y : insert ys x
```
And we can complete the two versions of the insertion sort program based on the first two equations in this chapter.

```haskell
isort [] = []
isort (x:xs) = insert (isort xs) x
```

Or we can express the recursion with folding.

```haskell
isort = foldl insert []
```

The linked-list solution can also be described imperatively. Suppose the function Key(x) returns the value of the element stored in node x, and Next(x) accesses the next node in the linked-list.

function Insert(L, x)
  p ← NIL
  H ← L
  while L ≠ NIL ∧ Key(L) < Key(x) do
    p ← L
    L ← Next(L)
  Next(x) ← L
  if p = NIL then
    H ← x
  else
    Next(p) ← x
  return H

For example, in ANSI C, the linked-list can be defined as the following.
```c
struct node {
    Key key;
    struct node* next;
};
```

Thus the insert function can be given as below.

```c
struct node* insert(struct node* lst, struct node* x) {
    struct node *p, *head;
    p = NULL;
    for (head = lst; lst && x->key > lst->key; lst = lst->next)
        p = lst;
    x->next = lst;
    if (!p)
        return x;
    p->next = x;
    return head;
}
```

Instead of an explicit linked-list built from pointers or references, a linked-list can also be realized with an extra index array. For any array element A_i, Next_i stores the index of the element that follows A_i; that is, A_{Next_i} is the next element after A_i. The insertion algorithm based on this idea is given below.

function Insert(A, Next, i)
  j ← ⊥
  while Next_j ≠ NIL ∧ A_{Next_j} < A_i do
    j ← Next_j
  Next_i ← Next_j
  Next_j ← i

Here ⊥ means the head of the Next table. The corresponding Python program for this algorithm is the following.

```python
def isort(xs):
    n = len(xs)
    next = [-1] * (n + 1)   # the last slot, next[-1], serves as the head
    for i in range(n):
        insert(xs, next, i)
    return next

def insert(xs, next, i):
    j = -1
    while next[j] != -1 and xs[next[j]] < xs[i]:
        j = next[j]
    next[j], next[i] = i, next[j]
```
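The result of this `isort` is the `next` index table rather than a rearranged array. As a usage illustration (a sketch added here, not from the original text), the sorted sequence can be read back by following the links from the head slot; rebuilding the array this way is exactly what Exercise 2.3 below asks for.

```python
def to_sorted(xs, next):
    # The head index is kept in the last slot (next[-1]);
    # -1 marks the end of the chain.
    ys = []
    i = next[-1]
    while i != -1:
        ys.append(xs[i])
        i = next[i]
    return ys

# Example: to_sorted(xs, isort(xs)) lists the elements in sorted order.
```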
Although we changed the insertion operation to constant time by using a linked-list, we now have to traverse the linked-list to find the position, which results in O(n²) comparisons. This is because a linked-list, unlike an array, doesn't support random access, which means we can't use binary search in the linked-list setting.

Exercise 2.3

Complete the insertion sort by using the linked-list insertion function in your favorite imperative programming language. The index-based linked-list version returns the sequence of rearranged indices as its result; write a program to re-order the original array of elements from this result.

2.5 Final improvement by binary search tree

It seems that we have driven into a corner: we must improve both the comparison and the insertion at the same time, or we will end up with O(n²) performance. We must use binary search, as this is the only way to reduce the comparison time to O(lg n). On the other hand, we must change the data structure, because we can't achieve constant-time insertion at a position with a plain array.
This reminds us of our `hello world' data structure, the binary search tree. It naturally supports binary search by its definition, and at the same time we can insert a new leaf into a binary search tree in O(1) constant time once we have found the location. So the algorithm changes to this.

function Sort(A)
  T ← ∅
  for each x ∈ A do
    T ← Insert-Tree(T, x)
  return To-List(T)

where Insert-Tree() and To-List() are described in the previous chapter about binary search trees. As we analyzed there, the performance of tree sort is bound to O(n lg n), which is the lower limit of comparison-based sorting [3].
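In Haskell this final evolution collapses to a one-line composition. The following is a small sketch, not from the original text: it assumes `insert` (the binary search tree insertion) and `toList` (the in-order traversal) from the binary search tree chapter, with `Empty` as the empty tree.

```haskell
-- Tree sort: fold the elements into a binary search tree, then
-- flatten it with the in-order traversal, which yields them sorted.
treeSort :: (Ord a) => [a] -> [a]
treeSort = toList . foldl insert Empty
```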
2.6 Short summary

In this chapter, we presented the evolution of insertion sort. It is well explained in most textbooks as the first sorting algorithm; it has a simple and straightforward idea, but the performance is quadratic. Some textbooks stop here, but we wanted to show that there are ways to improve it from different points of view. We first tried to save comparison time by using binary search, and then tried to save the insertion operation by changing the data structure to a linked-list. Finally, we combined these two ideas and evolved insertion sort into tree sort.
Bibliography

[1] http://en.wikipedia.org/wiki/Bubble_sort
[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.
[3] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional, 1998. ISBN-13: 978-0201896855.
Chapter 3 Red-black tree, not so complex as it was thought

3.1 Introduction

3.1.1 Exploit the binary search tree

We have shown the power of the binary search tree by using it to count the occurrence of every word in the Bible. The idea is to use the binary search tree as a dictionary for counting. One may come up with the idea of feeding a yellow pages book¹ to a binary search tree and using it to look up the phone number of a contact. Modifying the word occurrence counting program a bit yields the following code (includes added so it compiles).

```cpp
#include <iostream>
#include <fstream>
#include <map>
#include <string>
using namespace std;

int main(int, char**) {
    ifstream f("yp.txt");
    map<string, string> dict;
    string name, phone;
    while (f >> name >> phone)
        dict[name] = phone;
    for (;;) {
        cout << "\nname: ";
        cin >> name;
        if (dict.find(name) == dict.end())
            cout << "not found";
        else
            cout << "phone: " << dict[name];
    }
}
```

This program works well. However, if you replace the STL map with the binary search tree mentioned previously, the performance will be poor, especially when you search for names such as Zara, Zed or Zulu. This is because the content of a yellow pages book is typically listed in lexicographic order, which means the name list is in increasing order.

¹A name–phone number contact list book.
If we try to insert the sequence of numbers 1, 2, 3, ..., n into a binary search tree, we get a tree like the one in figure 3.1.

Figure 3.1: An unbalanced tree (a chain 1, 2, 3, ..., n).

This is an extremely unbalanced binary search tree. Lookup takes O(h) time for a tree of height h. In the balanced case we benefit from the binary search tree with O(lg N) search time, but in this extreme case the search time degrades to O(N) — no better than a plain linked-list.

Exercise 3.1

- For a very big yellow pages list, one may want to speed up the dictionary building process with two concurrent tasks (threads or processes): one task reads the name–phone pairs from the head of the list, while the other reads from the tail. Building terminates when the two tasks meet in the middle of the list. What will the binary search tree look like after such building? What if you split the list into more than two parts and use more tasks?
- Can you find any more cases that exploit a binary search tree? Please consider the unbalanced trees shown in figure 3.2.

3.1.2 How to ensure the balance of the tree

In order to avoid such cases, we can shuffle the input sequence with a randomized algorithm, as described in Section 12.4 of [1]. However, this method doesn't always work; for example, when the input is fed interactively by a user, the tree needs to be built/updated after each input.

People have found many solutions to keep a binary search tree balanced. Many of them rely on rotation operations on the binary search tree. Rotations change the tree structure while maintaining the ordering of the elements, so they either improve or keep the balance property of the binary search tree.
Figure 3.2: Some unbalanced trees. (a) a descending chain n, n−1, ..., 1; (b) a zig-zag shape built by alternately taking the smallest and largest remaining keys; (c) two chains growing downwards from the middle element m.
In this chapter, we'll first introduce the red-black tree, which is one of the most popular and widely used self-adjusting balanced binary search trees. In the next chapter, we'll introduce the AVL tree, which is another intuitive solution. In a later chapter about binary heaps, we'll show another interesting tree, the splay tree, which gradually adjusts the tree to make it more and more balanced.

3.1.3 Tree rotation

Figure 3.3: Tree rotation. `rotate-left' transforms the tree from the left side to the right side, and `rotate-right' does the inverse transformation.

Tree rotation is a special operation that transforms the tree structure without changing the in-order traversal result. It is based on the fact that for a specified ordering, multiple binary search trees correspond to it. Figure 3.3 shows tree rotation: for a binary search tree on the left side, left rotation transforms it to the tree on the right, and right rotation does the inverse transformation.

Although tree rotation can be realized in a procedural way, there exists a quite simple functional description using pattern matching.

$$rotateL(T) = \begin{cases} node(node(a, X, b), Y, c) & pattern(T) = node(a, X, node(b, Y, c)) \\ T & \text{otherwise} \end{cases} \quad (3.1)$$

$$rotateR(T) = \begin{cases} node(a, X, node(b, Y, c)) & pattern(T) = node(node(a, X, b), Y, c) \\ T & \text{otherwise} \end{cases} \quad (3.2)$$
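These two equations translate almost literally into Haskell. A minimal sketch (not from the original text), assuming the plain binary search tree type with constructors `Empty` and `Node left key right` from chapter 1:

```haskell
-- If the tree matches the required shape, rebuild it in rotated
-- form; otherwise return it unchanged, exactly as in (3.1), (3.2).
rotateL, rotateR :: Tree a -> Tree a
rotateL (Node a x (Node b y c)) = Node (Node a x b) y c
rotateL t = t

rotateR (Node (Node a x b) y c) = Node a x (Node b y c)
rotateR t = t
```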
The imperative pseudo-code, in contrast, has to set all the fields accordingly.

1: function Left-Rotate(T, x)
2:   p ← Parent(x)
3:   y ← Right(x)  ▷ assume y ≠ NIL
4:   a ← Left(x)
5:   b ← Left(y)
6:   c ← Right(y)
7:   Replace(x, y)
8:   Set-Children(x, a, b)
9:   Set-Children(y, x, c)
10:  if p = NIL then
11:    T ← y
12:  return T

13: function Right-Rotate(T, y)
14:   p ← Parent(y)
15:   x ← Left(y)  ▷ assume x ≠ NIL
16:   a ← Left(x)
17:   b ← Right(x)
18:   c ← Right(y)
19:   Replace(y, x)
20:   Set-Children(y, b, c)
21:   Set-Children(x, a, y)
22:   if p = NIL then
23:     T ← x
24:   return T

25: function Set-Left(x, y)
26:   Left(x) ← y
27:   if y ≠ NIL then Parent(y) ← x

28: function Set-Right(x, y)
29:   Right(x) ← y
30:   if y ≠ NIL then Parent(y) ← x

31: function Set-Children(x, L, R)
32:   Set-Left(x, L)
33:   Set-Right(x, R)

34: function Replace(x, y)
35:   if Parent(x) = NIL then
36:     if y ≠ NIL then Parent(y) ← NIL
37:   else if Left(Parent(x)) = x then Set-Left(Parent(x), y)
38:   else Set-Right(Parent(x), y)
39:   Parent(x) ← NIL

Comparing this pseudo-code with the pattern matching functions, the former focuses on the changes of structural state, while the latter focuses on the rotation process itself. As the title of this chapter indicates, the red-black tree needn't be as complex as it was thought. Most traditional algorithm textbooks use the classic procedural way to teach the red-black tree: there are several cases to handle, and all of them require careful manipulation of the node fields. However, by switching to a functional setting, things become intuitive and simple, although there is some performance overhead. Most of the content in this chapter is based on Chris Okasaki's work in [2].
3.2 Definition of red-black tree

A red-black tree is a type of self-balancing binary search tree [4]². By using color changes and rotations, the red-black tree provides a very simple and straightforward way to keep the tree balanced. For a binary search tree, we can augment the nodes with a color field; a node can be colored either red or black. We call a binary search tree a red-black tree if it satisfies the following 5 properties [1].

1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both its children are black.
5. For each node, all paths from the node to descendant leaves contain the same number of black nodes.

Why can these 5 properties ensure that the red-black tree is well balanced? Because they have a key consequence: the longest path from the root to a leaf can't be more than 2 times longer than the shortest path. Note the 4th property, which means there can't be two adjacent red nodes: the shortest possible path contains only black nodes, and any longer path has red nodes interleaved between black ones. According to property 5, all paths have the same number of black nodes, which finally ensures no path is more than 2 times longer than any other [4]. Figure 3.4 shows an example red-black tree.

Figure 3.4: An example red-black tree.

²The red-black tree is one of the equivalent forms of the 2-3-4 tree (see the B-tree chapter about 2-3-4 trees). That is to say, for any 2-3-4 tree, there is at least one red-black tree with the same data order.
All read-only operations, such as search and min/max, are the same as in an ordinary binary search tree; only insertion and deletion are special. As we have shown in the word occurrence example, many implementations of set or map containers are based on the red-black tree. One example is the C++ Standard Template Library (STL) [6].

As mentioned previously, the only change in the data layout is the color information augmented to the binary search tree. This can be represented as a data field in imperative languages such as C++, like below.

```cpp
enum Color { Red, Black };

template <class T>
struct node {
    Color color;
    T key;
    node* left;
    node* right;
    node* parent;
};
```

In a functional setting, we can add the color information to the constructors. Below is a Haskell definition of the red-black tree.

```haskell
data Color = R | B
data RBTree a = Empty
              | Node Color (RBTree a) a (RBTree a)
```

Exercise 3.2

Can you prove that a red-black tree with n nodes has height at most 2 lg(n + 1)?

3.3 Insertion

Inserting a new node as described for the binary search tree may leave the tree unbalanced. The red-black properties have to be maintained, so we need to do some fixing by transforming the tree after insertion.
When we insert a new key, one good practice is to always insert it as a red node. As long as the newly inserted node isn't the root of the tree, we keep all properties except possibly the 4th one: the insertion may create two adjacent red nodes.

Functional and procedural implementations have different fixing methods. One is intuitive but has some overhead; the other is a bit complex but has higher performance. Most algorithm textbooks introduce the latter. In this chapter we focus on the former, to show how easily a red-black tree insertion algorithm can be realized; the traditional procedural method is given only for comparison.

As described by Chris Okasaki, there are in total 4 cases which violate property 4. All of them have 2 adjacent red nodes; however, they have a uniform form after fixing, as shown in figure 3.5. Note that this transformation moves the redness one level up, so this is a bottom-up recursive fixing, whose last step may make the root node red.
Figure 3.5: The 4 cases for balancing a red-black tree after insertion; all four transform to the same uniform result.
According to property 2, the root is always black, so we need a final fixing step to revert the root color to black.

Observing that the 4 cases and the fixed result have strong pattern features, the fixing function can be defined using a method similar to the one mentioned for tree rotation. To avoid overly long formulas, we abbreviate Color as C, Black as B, and Red as R.

$$balance(T) = \begin{cases} node(R, node(B, A, x, B), y, node(B, C, z, D)) & match(T) \\ T & \text{otherwise} \end{cases} \quad (3.3)$$

where the function node() constructs a red-black tree node from 4 parameters: the color, the left child, the key and the right child. The function match() tests whether a tree is one of the 4 possible patterns:

match(T) = pattern(T) = node(B, node(R, node(R, A, x, B), y, C), z, D) ∨
           pattern(T) = node(B, node(R, A, x, node(R, B, y, C)), z, D) ∨
           pattern(T) = node(B, A, x, node(R, B, y, node(R, C, z, D))) ∨
           pattern(T) = node(B, A, x, node(R, node(R, B, y, C), z, D))

With the function balance() defined, we can modify the previous binary search tree insertion function to make it work for the red-black tree.

$$insert(T, k) = makeBlack(ins(T, k)) \quad (3.4)$$

where

$$ins(T, k) = \begin{cases} node(R, \emptyset, k, \emptyset) & T = \emptyset \\ balance(node(C, ins(L, k), Key, R)) & k < Key \\ balance(node(C, L, Key, ins(R, k))) & \text{otherwise} \end{cases} \quad (3.5)$$

C, L, R, Key represent the color, left child, right child and key of a non-empty tree:

$C = color(T)$, $L = left(T)$, $R = right(T)$, $Key = key(T)$

The function makeBlack() is defined as the following; it forces the color of a non-empty tree to be black.

$$makeBlack(T) = node(B, L, Key, R) \quad (3.6)$$

Summarizing the above functions and using language-supported pattern matching, we arrive at the following Haskell program.

```haskell
insert :: (Ord a) => RBTree a -> a -> RBTree a
insert t x = makeBlack $ ins t where
    ins Empty = Node R Empty x Empty
    ins (Node color l k r)
        | x < k     = balance color (ins l) k r
        | otherwise = balance color l k (ins r)
    makeBlack (Node _ l k r) = Node B l k r
```
```haskell
balance :: Color -> RBTree a -> a -> RBTree a -> RBTree a
balance B (Node R (Node R a x b) y c) z d = Node R (Node B a x b) y (Node B c z d)
balance B (Node R a x (Node R b y c)) z d = Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R b y (Node R c z d)) = Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R (Node R b y c) z d) = Node R (Node B a x b) y (Node B c z d)
balance color l k r = Node color l k r
```

Note that the balance function is changed a bit from the original definition: instead of passing the tree, we pass the color, the left child, the key and the right child to it. This saves a pair of `boxing' and `un-boxing' operations. This program doesn't handle the case of inserting a duplicated key; however, it is possible to handle it either by overwriting or by skipping. Another option is to augment the data with a linked list [2].
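To see the insertion function at work, a tree can be built by folding `insert` over a list of keys. This `fromList` helper is a sketch added here for illustration; it is not part of the original program.

```haskell
-- Build a red-black tree by inserting keys one by one; e.g.
-- fromList [11, 2, 14, 1, 7, 15, 5, 8, 4] yields the first tree
-- shown in figure 3.6.
fromList :: (Ord a) => [a] -> RBTree a
fromList = foldl insert Empty
```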
Figure 3.6 shows two red-black trees built by feeding the lists 11, 2, 14, 1, 7, 15, 5, 8, 4 and 1, 2, ..., 8.

Figure 3.6: Insertion results generated by the Haskell algorithm.

This algorithm shows great simplicity by summarizing the uniform feature of the four different unbalanced cases. It is more expressive than the traditional tree rotation approach: even in programming languages which don't support pattern matching, the algorithm can still be implemented by checking the patterns manually. A Scheme/Lisp program available along with this book can be referenced as an example. The insertion algorithm takes O(lg N) time to insert a key into a red-black tree of N nodes.

Exercise 3.3

Write a program in an imperative language, such as C, C++ or Python, to realize the same algorithm in this section. Note that, because there is no language-supported pattern matching, you need to test the 4 different cases manually.
3.4 Deletion

Recall the deletion section of the binary search tree chapter. Deletion is `imperative only' for the red-black tree as well: in typical practice, one often builds the tree just once and performs lookups frequently afterwards. Okasaki explained why he didn't provide red-black tree deletion in his work in [3]; one reason is that deletion is much messier than insertion. The purpose of this section is just to show that red-black tree deletion is possible in a purely functional setting, although it actually rebuilds the tree, because trees are read-only in purely functional data structures. In the real world, it's up to the user (actually the programmer) to adopt the proper solution. One option is to mark the node to be deleted with a flag, and perform a tree rebuild when the number of deleted nodes exceeds 50% of the total number of nodes.

Even in an imperative setting, deletion is more complex than insertion; we face more cases to fix. Deletion may also violate the red-black tree properties, so we need to fix them after the normal deletion described for the binary search tree. The deletion algorithm in this book is based on a handout [5]. The problem only happens when we delete a black node, which violates the 5th property of the red-black tree: the number of black nodes on the affected paths decreases, so the black height is no longer uniform.

When deleting a black node, we can restore property 5 by introducing the concept of `doubly-black' [1]. It means that although the node is deleted, its blackness is kept by storing it in the parent node: if the parent node is red, it turns black; however, if it is already black, it turns `doubly-black'. In order to express the doubly-black node, the definition needs some modification accordingly.

```haskell
data Color = R | B | BB        -- BB: doubly black, for deletion
data RBTree a = Empty
              | BBEmpty        -- doubly black empty
              | Node Color (RBTree a) a (RBTree a)
```

When deleting a node, we first perform the same deletion algorithm as for the binary search tree mentioned in the previous chapter; after that, if the node spliced out is black, we need to fix the tree to keep the red-black properties.
Let's denote the empty tree as ∅; a non-empty tree can be decomposed into node(Color, L, Key, R): its color, left sub-tree, key and right sub-tree. The delete function is defined as the following.

$$delete(T, k) = blackenRoot(del(T, k)) \quad (3.7)$$

where

$$del(T, k) = \begin{cases} \emptyset & T = \emptyset \\ fixBlack^2(node(C, del(L, k), Key, R)) & k < Key \\ fixBlack^2(node(C, L, Key, del(R, k))) & k > Key \\ \begin{cases} mkBlk(R) & C = B \\ R & \text{otherwise} \end{cases} & k = Key \land L = \emptyset \\ \begin{cases} mkBlk(L) & C = B \\ L & \text{otherwise} \end{cases} & k = Key \land R = \emptyset \\ fixBlack^2(node(C, L, k', del(R, k'))) & \text{otherwise}, k' = min(R) \end{cases} \quad (3.8)$$

The real deletion happens inside the function del. In the trivial case the tree is empty and the result is ∅. If the key to be deleted is less than the key of the current node, we recursively perform deletion on the left sub-tree; if it is bigger, we recursively delete from the right sub-tree. Because this may introduce doubly-blackness, we need to fix it. If the key to be deleted is equal to the key of the current node, we need to splice the node out: if one of its children is empty, we just replace the node by the other one, preserving the blackness of the node; otherwise we cut and paste the minimum element k' = min(R) of the right sub-tree.

The delete function just forces the result tree of del to have a black root. This is realized by the function blackenRoot.

$$blackenRoot(T) = \begin{cases} \emptyset & T = \emptyset \\ node(B, L, Key, R) & \text{otherwise} \end{cases} \quad (3.9)$$

Compared with the makeBlack function defined in the red-black tree insertion section, they are almost the same, except for the case of the empty tree. That case is only valid in deletion, because insertion can't result in an empty tree, while deletion can.
The function mkBlk is defined to preserve the blackness of a node. If the node to be spliced out isn't black, this function won't be applied; otherwise, it turns a red node black and a black node doubly-black. It also marks an empty tree as doubly-black empty.

$$mkBlk(T) = \begin{cases} \Phi & T = \emptyset \\ node(B, L, Key, R) & C = R \\ node(B^2, L, Key, R) & C = B \\ T & \text{otherwise} \end{cases} \quad (3.10)$$

where Φ means the doubly-black empty node and B² is the doubly-black color.

Summarizing the above functions yields the following Haskell program.

```haskell
delete :: (Ord a) => RBTree a -> a -> RBTree a
delete t x = blackenRoot (del t x) where
    del Empty _ = Empty
    del (Node color l k r) x
        | x < k = fixDB color (del l x) k r
        | x > k = fixDB color l k (del r x)
        -- x == k, delete this node
        | isEmpty l = if color == B then makeBlack r else r
        | isEmpty r = if color == B then makeBlack l else l
        | otherwise = fixDB color l k' (del r k') where k' = min r
    blackenRoot (Node _ l k r) = Node B l k r
    blackenRoot _ = Empty

makeBlack :: RBTree a -> RBTree a
makeBlack (Node B l k r) = Node BB l k r  -- doubly black
makeBlack (Node _ l k r) = Node B l k r
makeBlack Empty = BBEmpty
makeBlack t = t
```
The final attack on the red-black tree deletion algorithm is to realize the fixBlack² function. Its purpose is to eliminate the `doubly-black' colored nodes by rotation and color changes.

Let's solve the doubly-black empty node first. For any node, if one of its children is doubly-black empty and the other child is non-empty, we can safely replace the doubly-black empty with a normal empty node. As in figure 3.7, if we are going to delete node 4 from the tree (only part of the tree is shown), the program will use a doubly-black empty node to replace node 4. In the figure, the doubly-black node is drawn as a black circle with 2 edges. Node 5 then has a doubly-black empty left child and a non-empty right child (a leaf with key 6). In such a case we can safely change the doubly-black empty to a normal empty node, which won't violate any red-black properties.

Figure 3.7: One child is a doubly-black empty node, the other child is non-empty: (a) delete 4 from the tree; (b) after 4 is spliced out, it is doubly-black empty; (c) we can safely change it to a normal NIL.

On the other hand, if a node has a doubly-black empty child and the other child is empty, we have to push the doubly-blackness up one level. For example, in figure 3.8, if we want to delete node 1 from the tree, the program will use a doubly-black empty node to replace 1. Then node 2 has a doubly-black empty left child and an empty right child. In such a case we must mark node 2 as doubly-black after changing its left child back to empty.

Figure 3.8: One child is a doubly-black empty node, the other child is empty: (a) delete 1 from the tree; (b) after 1 is spliced out, it is doubly-black empty; (c) we must push the doubly-blackness up to node 2.

Based on the above analysis, in order to fix the doubly-black empty node, we define the function partially like the following.

$$fixBlack^2(T) = \begin{cases} node(B^2, \emptyset, Key, \emptyset) & (L = \Phi \land R = \emptyset) \lor (L = \emptyset \land R = \Phi) \\ node(C, \emptyset, Key, R) & L = \Phi \land R \neq \emptyset \\ node(C, L, Key, \emptyset) & R = \Phi \land L \neq \emptyset \\ ... & ... \end{cases} \quad (3.11)$$
After dealing with the doubly-black empty node, we need to fix the case where the sibling of the doubly-black node is black and has one red child. In this situation, we can fix the doubly-blackness with one rotation. There are actually 4 different sub-cases, and all of them can be transformed to one uniform pattern. They are shown in figure 3.9. These cases are described in [1] as case 3 and case 4.

Figure 3.9: Fix the doubly-black by rotation: the sibling of the doubly-black node is black, and it has one red child.
The handling of these 4 sub-cases can be defined on top of formula 3.11.

$$fixBlack^2(T) = \begin{cases} ... & ... \\ node(C, node(B, mkBlk(A), x, B), y, node(B, C, z, D)) & p_{1.1} \\ node(C, node(B, A, x, B), y, node(B, C, z, mkBlk(D))) & p_{1.2} \\ ... & ... \end{cases} \quad (3.12)$$

where p1.1 and p1.2 each represent 2 patterns, as the following.

p1.1: { pattern(T) = node(C, A, x, node(B, node(R, B, y, C), z, D)) ∧ Color(A) = B² } ∨ { pattern(T) = node(C, A, x, node(B, B, y, node(R, C, z, D))) ∧ Color(A) = B² }

p1.2: { pattern(T) = node(C, node(B, A, x, node(R, B, y, C)), z, D) ∧ Color(D) = B² } ∨ { pattern(T) = node(C, node(B, node(R, A, x, B), y, C), z, D) ∧ Color(D) = B² }

Besides the above cases, there is another one where not only is the sibling of the doubly-black node black, but its two children are also black. In this case we change the color of the sibling to red, restore the doubly-black node to black, and propagate the doubly-blackness one level up to the parent node, as shown in figure 3.10. Note that there are two symmetric sub-cases. This case is described in [1] as case 2. We continue adding this fixing after formula 3.12.

$$fixBlack^2(T) = \begin{cases} ... & ... \\ mkBlk(node(C, mkBlk(A), x, node(R, B, y, C))) & p_{2.1} \\ mkBlk(node(C, node(R, A, x, B), y, mkBlk(C))) & p_{2.2} \\ ... & ... \end{cases} \quad (3.13)$$

where p2.1 and p2.2 are the two patterns below.

p2.1: pattern(T) = node(C, A, x, node(B, B, y, C)) ∧ Color(A) = B² ∧ Color(B) = Color(C) = B

p2.2: pattern(T) = node(C, node(B, A, x, B), y, C) ∧ Color(C) = B² ∧ Color(A) = Color(B) = B

There is one final case left: the sibling of the doubly-black node is red. We can perform a rotation to transform this case to pattern p1.1 or p1.2, as figure 3.11 shows. We finish formula 3.13 with 3.14.

$$fixBlack^2(T) = \begin{cases} ... & ... \\ fixBlack^2(node(B, fixBlack^2(node(R, A, x, B)), y, C)) & p_{3.1} \\ fixBlack^2(node(B, A, x, fixBlack^2(node(R, B, y, C)))) & p_{3.2} \\ T & \text{otherwise} \end{cases} \quad (3.14)$$
Figure 3.10: Propagate the blackness up: the color of x (resp. y) can be either black or red before the change; if it was red it becomes black, otherwise it becomes doubly-black.

Figure 3.11: The sibling of the doubly-black node is red.
where p3.1 and p3.2 are the following two patterns.

p3.1: Color(T) = B ∧ Color(L) = B² ∧ Color(R) = R

p3.2: Color(T) = B ∧ Color(L) = R ∧ Color(R) = B²

These two cases are described in [1] as case 1.

Fixing the doubly-black node with all the different cases above is a recursive function. There are two kinds of termination: patterns p1.1 and p1.2 eliminate the doubly-black node, while the other cases may keep propagating the doubly-blackness from the bottom up towards the root. In the end, the algorithm marks the root node black anyway, so the doubly-blackness is removed there. Putting formulas 3.11, 3.12, 3.13 and 3.14 together, we can write the final Haskell program.
```haskell
fixDB :: Color -> RBTree a -> a -> RBTree a -> RBTree a
fixDB color BBEmpty k Empty = Node BB Empty k Empty
fixDB color BBEmpty k r = Node color Empty k r
fixDB color Empty k BBEmpty = Node BB Empty k Empty
fixDB color l k BBEmpty = Node color l k Empty
-- the sibling is black, and it has one red child
fixDB color a@(Node BB _ _ _) x (Node B (Node R b y c) z d) =
    Node color (Node B (makeBlack a) x b) y (Node B c z d)
fixDB color a@(Node BB _ _ _) x (Node B b y (Node R c z d)) =
    Node color (Node B (makeBlack a) x b) y (Node B c z d)
fixDB color (Node B a x (Node R b y c)) z d@(Node BB _ _ _) =
    Node color (Node B a x b) y (Node B c z (makeBlack d))
fixDB color (Node B (Node R a x b) y c) z d@(Node BB _ _ _) =
    Node color (Node B a x b) y (Node B c z (makeBlack d))
-- the sibling and its 2 children are all black: propagate the blackness up
fixDB color a@(Node BB _ _ _) x (Node B b@(Node B _ _ _) y c@(Node B _ _ _)) =
    makeBlack (Node color (makeBlack a) x (Node R b y c))
fixDB color (Node B a@(Node B _ _ _) x b@(Node B _ _ _)) y c@(Node BB _ _ _) =
    makeBlack (Node color (Node R a x b) y (makeBlack c))
-- the sibling is red
fixDB B a@(Node BB _ _ _) x (Node R b y c) = fixDB B (fixDB R a x b) y c
fixDB B (Node R a x b) y c@(Node BB _ _ _) = fixDB B a x (fixDB R b y c)
-- otherwise
fixDB color l k r = Node color l k r
```

The deletion algorithm takes O(lg N) time to delete a key from a red-black tree with N nodes.

Exercise 3.4

As mentioned in this section, deletion can also be implemented by just marking the node as deleted without actually removing it. Once the number of marked nodes exceeds 50% of the total number of nodes, a tree rebuild is performed. Try to implement this method in your favorite programming language.
3.5 Imperative red-black tree algorithm ⋆

We have almost finished all the content of this chapter. By induction over the patterns, we can implement the red-black tree in a simple way compared to the imperative tree rotation solution. However, for completeness, we should show the counterpart.

For insertion, the basic idea is to use an algorithm similar to the one described for the binary search tree, and then fix the balance problem by rotation before returning the final result.

1: function Insert(T, k)
2:   root ← T
3:   x ← Create-Leaf(k)
4:   Color(x) ← RED
5:   parent ← NIL
6:   while T ≠ NIL do
7:     parent ← T
8:     if k < Key(T) then
9:       T ← Left(T)
10:    else
11:      T ← Right(T)
12:  Parent(x) ← parent
13:  if parent = NIL then  ▷ tree T is empty
14:    return x
15:  else if k < Key(parent) then
16:    Left(parent) ← x
17:  else
18:    Right(parent) ← x
19:  return Insert-Fix(root, x)

The only difference from the binary search tree insertion algorithm is that we set the color of the new node to red and perform fixing before returning. It is easy to translate the pseudo-code into a real imperative programming language, for instance Python³.

```python
def rb_insert(t, key):
    root = t
    x = Node(key)
    parent = None
    while t:
        parent = t
        if key < t.key:
            t = t.left
        else:
            t = t.right
    if parent is None:  # tree is empty
        root = x
    elif key < parent.key:
        parent.set_left(x)
    else:
        parent.set_right(x)
    return rb_insert_fix(root, x)
```

³C and C++ source code is available along with this book.
There are 3 base cases for fixing, and if we take left-right symmetry into consideration, there are 6 cases in total. Among them, two cases can be merged, because they both have a red uncle node: we can toggle the parent and uncle colors to black and set the grandparent color to red. With this merging, the fixing algorithm can be realized as the following.

1: function Insert-Fix(T, x)
2:   while Parent(x) ≠ NIL and Color(Parent(x)) = RED do
3:     if Color(Uncle(x)) = RED then  ▷ Case 1, x's uncle is red
4:       Color(Parent(x)) ← BLACK
5:       Color(Grand-Parent(x)) ← RED
6:       Color(Uncle(x)) ← BLACK
7:       x ← Grand-Parent(x)
8:     else  ▷ x's uncle is black
9:       if Parent(x) = Left(Grand-Parent(x)) then
10:        if x = Right(Parent(x)) then  ▷ Case 2, x is a right child
11:          x ← Parent(x)
12:          T ← Left-Rotate(T, x)
13:        ▷ Case 3, x is a left child
14:        Color(Parent(x)) ← BLACK
15:        Color(Grand-Parent(x)) ← RED
16:        T ← Right-Rotate(T, Grand-Parent(x))
17:      else
18:        if x = Left(Parent(x)) then  ▷ Case 2, symmetric
19:          x ← Parent(x)
20:          T ← Right-Rotate(T, x)
21:        ▷ Case 3, symmetric
22:        Color(Parent(x)) ← BLACK
23:        Color(Grand-Parent(x)) ← RED
24:        T ← Left-Rotate(T, Grand-Parent(x))
25:   Color(T) ← BLACK
26:   return T

This program takes O(lg N) time to insert a new key into the red-black tree. Comparing this pseudo-code with the balance function defined in the previous section, we can see that they differ not only in terms of simplicity, but also in logic: even if we feed the same series of keys to the two algorithms, they may build different red-black trees. There is a bit of performance overhead in the pattern matching algorithm; Okasaki discussed the differences in detail in his paper [2]. Translating the above algorithm to Python yields the program below.

```python
# Fix the red->red violation
def rb_insert_fix(t, x):
    while x.parent and x.parent.color == RED:
        if x.uncle().color == RED:
            # case 1: ((a:R x:R b) y:B c:R) => ((a:R x:B b) y:R c:B)
            set_color([x.parent, x.grandparent(), x.uncle()],
                      [BLACK, RED, BLACK])
            x = x.grandparent()
        else:
            if x.parent == x.grandparent().left:
                if x == x.parent.right:
                    # case 2: ((a x:R b:R) y:B c) => case 3
                    x = x.parent
                    t = left_rotate(t, x)
                # case 3: ((a:R x:R b) y:B c) => (a:R x:B (b y:R c))
                set_color([x.parent, x.grandparent()], [BLACK, RED])
                t = right_rotate(t, x.grandparent())
            else:
                if x == x.parent.left:
                    # case 2': (a x:B (b:R y:R c)) => case 3'
                    x = x.parent
                    t = right_rotate(t, x)
                # case 3': (a x:B (b y:R c:R)) => ((a x:R b) y:B c:R)
                set_color([x.parent, x.grandparent()], [BLACK, RED])
                t = left_rotate(t, x.grandparent())
    t.color = BLACK
    return t
```
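The program above relies on a few helpers (`uncle`, `grandparent`, `set_color`, and the `set_left`/`set_right` used by `rb_insert`) that come from the book's accompanying source code. A rough sketch of what they might look like — these definitions are assumptions for illustration, not the book's actual code:

```python
RED, BLACK = 'R', 'B'

class Node:
    def __init__(self, key, color=RED):
        self.key, self.color = key, color
        self.left = self.right = self.parent = None

    def set_left(self, x):
        self.left = x
        if x is not None:
            x.parent = self

    def set_right(self, x):
        self.right = x
        if x is not None:
            x.parent = self

    def grandparent(self):
        return self.parent.parent if self.parent else None

    def uncle(self):
        g = self.grandparent()
        if g is None:
            return None
        return g.right if self.parent == g.left else g.left

def set_color(nodes, colors):
    # Assign colors pairwise, e.g. set_color([p, g], [BLACK, RED]).
    for node, color in zip(nodes, colors):
        node.color = color
```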
Figure 3.12 shows the results of feeding the same series of keys to the above Python insertion program. Comparing them with figure 3.6, one can tell the difference clearly.

Figure 3.12: Red-black trees created by the imperative algorithm.

We skip the red-black tree deletion algorithm in the imperative setting, because it is even more complex than insertion. The implementation of deletion is left as an exercise of this chapter.

Exercise 3.5

Implement the red-black tree deletion algorithm in your favorite imperative programming language. You can refer to [1] for algorithm details.

3.6 More words

The red-black tree is the most popular implementation of the balanced binary search tree. Another one is the AVL tree, which we'll introduce in the next chapter. The red-black tree can be a good starting point for further data structures: if we extend the number of children from 2 to K and keep the balance as well, it leads to the B-tree; if we store the data along the edges instead of inside the nodes, it leads to Tries. However, the handling of multiple cases and the long programs tend to make newcomers think the red-black tree is complex. Okasaki's work helps make the red-black tree much easier to understand, and there are many implementations in other programming languages in that manner [7]. It also inspired me to find pattern matching solutions for the splay tree, the AVL tree, and others.
Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.
[2] Chris Okasaki. FUNCTIONAL PEARLS: Red-Black Trees in a Functional Setting. Journal of Functional Programming, 1998.
[3] Chris Okasaki. Ten Years of Purely Functional Data Structures. http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data.html
[4] Wikipedia. Red-black tree. http://en.wikipedia.org/wiki/Red-black_tree
[5] Lyn Turbak. Red-Black Trees. http://cs.wellesley.edu/~cs231/fall01/red-black.pdf, Nov. 2, 2001.
[6] SGI. Standard Template Library Programmer's Guide. http://www.sgi.com/tech/stl/
[7] Pattern matching. http://rosettacode.org/wiki/Pattern_matching
Chapter 4 AVL tree

4.1 Introduction

4.1.1 How to measure the balance of a tree?

Besides the red-black tree, are there any other intuitive solutions for a self-balancing binary search tree? In order to measure how balanced a binary search tree is, one idea is to compare the heights of the left and right sub-trees: if they differ a lot, the tree isn't well balanced. Let's denote the height difference between the two children as

$$\delta(T) = |R| - |L| \quad (4.1)$$

where |T| means the height of tree T, and L, R denote the left and right sub-trees.

If δ(T) = 0, the tree is definitely balanced. For example, a complete binary tree has N = 2^h − 1 nodes for height h; there are no empty branches except at the leaves. Another trivial case is the empty tree: δ(∅) = 0. The smaller the absolute value of δ(T), the more balanced the tree is. We define δ(T) as the balance factor of a binary search tree.
4.2 Definition of AVL tree

An AVL tree is a special binary search tree in which all sub-trees satisfy the following criterion:

$$|\delta(T)| \leq 1 \quad (4.2)$$

The absolute value of the balance factor is less than or equal to 1, which means there are only three valid values: −1, 0 and 1. Figure 4.1 shows an example AVL tree.

Figure 4.1: An example AVL tree.

Why can the AVL tree keep itself balanced? In other words, does this definition ensure that the height of the tree is O(lg N), where N is the number of nodes in the tree? Let's prove this fact.

For an AVL tree of height h, the number of nodes varies. It can have at most 2^h − 1 nodes (a complete binary tree); we are interested in how many nodes it has at least. Let's denote the minimum number of nodes of an AVL tree of height h as N(h). The trivial cases are obvious:

- empty tree: h = 0, N(0) = 0;
- singleton root: h = 1, N(1) = 1.

What's the situation in the common case? Figure 4.2 shows an AVL tree T of height h. It contains three parts: the root node and two sub-trees A, B. We have the following fact:

$$h = max(height(A), height(B)) + 1 \quad (4.3)$$

There must be one child of height h − 1; let's say height(A) = h − 1. According to the definition of the AVL tree, we have |height(A) − height(B)| ≤ 1, so the height of the other sub-tree B can't be lower than h − 2. The total number of nodes of T is the number of nodes of A plus that of B plus 1 (for the root node), hence at minimum

$$N(h) = N(h-1) + N(h-2) + 1 \quad (4.4)$$

Figure 4.2: An AVL tree of height h; one sub-tree has height h − 1, the other h − 2.

This recursion reminds us of the famous Fibonacci series. Actually, we can transform it to the Fibonacci recurrence by defining N′(h) = N(h) + 1, so equation 4.4 changes to

$$N'(h) = N'(h-1) + N'(h-2) \quad (4.5)$$

(so N′(h) runs 1, 2, 3, 5, 8, ..., the Fibonacci numbers).

Lemma 4.2.1. Let N(h) be the minimum number of nodes of an AVL tree of height h, and N′(h) = N(h) + 1. Then

$$N'(h) \geq \varphi^h \quad (4.6)$$

where φ = (√5 + 1)/2 is the golden ratio.

Proof. For the trivial cases, we have

- h = 0: N′(0) = 1 ≥ φ⁰ = 1
- h = 1: N′(1) = 2 ≥ φ¹ ≈ 1.618

For the induction case, suppose N′(h) ≥ φ^h. Then

N′(h + 1) = N′(h) + N′(h − 1)   {Fibonacci}
          ≥ φ^h + φ^{h−1}
          = φ^{h−1}(φ + 1)      {φ + 1 = φ² = (√5 + 3)/2}
          = φ^{h+1}

From Lemma 4.2.1, we immediately get

$$h \leq \log_\varphi(N + 1) = \log_\varphi 2 \cdot \lg(N + 1) \approx 1.44 \lg(N + 1) \quad (4.7)$$

It tells us that the height of an AVL tree is proportional to O(lg N), which means the AVL tree is balanced.
During the basic mutable tree operations, such as insertion and deletion, if the balance factor changes to an invalid value, some fixing has to be performed to restore |δ| to within 1. Most implementations utilize tree rotations. In this chapter, we'll show a pattern matching solution inspired by Okasaki's red-black tree solution [2]. Because of this modify-and-fix approach, the AVL tree is also a kind of self-balancing binary search tree. For comparison purposes, we'll also show the procedural algorithms.

Of course we could compute the δ value recursively; another option is to store the balance factor inside each node and update it when we modify the tree. The latter avoids computing the same value every time. Based on this idea, we can add one data field to the original binary search tree, as in the following C++ code example¹.

```cpp
template <class T>
struct node {
    int delta;
    T key;
    node* left;
    node* right;
    node* parent;
};
```

In a purely functional setting, some implementations use different constructors to store the information. For example, in [1] there are 4 constructors defined: E, N, P, Z — E for the empty tree, N for a tree with balance factor −1, P for +1, and Z for the zero case. In this chapter, we'll explicitly store the balance factor inside the node.

```haskell
data AVLTree a = Empty
               | Br (AVLTree a) a (AVLTree a) Int
```

¹Some implementations store the height of the tree instead of δ, as in [5].
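As a small illustration (not from the original text), the stored balance factor can be validated against the definition δ(T) = |R| − |L| by recomputing heights; such a checker is handy when testing the insertion algorithm below.

```haskell
-- Recompute heights and check that every node stores
-- delta = height(right) - height(left) with |delta| <= 1.
height :: AVLTree a -> Int
height Empty = 0
height (Br l _ r _) = 1 + max (height l) (height r)

isAVL :: AVLTree a -> Bool
isAVL Empty = True
isAVL (Br l _ r d) =
    d == height r - height l && abs d <= 1 && isAVL l && isAVL r
```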
The immutable operations, including lookup and finding the maximum and minimum elements, are the same as for the binary search tree. We'll skip them and focus on the mutable operations.

4.3 Insertion

Inserting a new element into an AVL tree may violate the AVL property, i.e. |δ| may exceed 1. To restore it, one option is to perform tree rotations according to the different insertion cases; most implementations are based on this approach. Another way is to use a pattern matching method similar to the one Okasaki used in his red-black tree implementation [2]. Inspired by this idea, it is possible to provide a simple and intuitive solution.

When we insert a new key into an AVL tree, the balance factor of the root may change within the range [−1, 1], and the height may increase by at most one. We need to use this information recursively to update the values in the upper-level nodes. Thus we define the result of the insertion algorithm as a pair (T′, ΔH), where T′ is the new tree and ΔH is the increment in height. Let first(pair) denote the function that returns the first element of a pair. We can modify the binary search tree insertion algorithm as follows to handle the AVL tree.

$$insert(T, k) = first(ins(T, k)) \quad (4.8)$$

where

$$ins(T, k) = \begin{cases} (node(\emptyset, k, \emptyset, 0), 1) & T = \emptyset \\ tree(ins(L, k), Key, (R, 0), \delta) & k < Key \\ tree((L, 0), Key, ins(R, k), \delta) & \text{otherwise} \end{cases} \quad (4.9)$$

L, R, Key, δ represent the left child, right child, key and balance factor of the tree:

$L = left(T)$, $R = right(T)$, $Key = key(T)$, $\delta = \delta(T)$
When we insert a new key k into an AVL tree T, if the tree is empty, we just create a leaf node containing k and set the balance factor to 0; the height is increased by one. This is the trivial case. Function node() is defined to build a tree from a left sub-tree, a right sub-tree, a key and a balance factor.

If T isn't empty, we compare Key with k. If k is less than the key, we recursively insert it into the left child; otherwise we insert it into the right child. As defined above, the result of the recursive insertion is a pair like (L′, ΔHl), so we need to perform the balancing adjustment as well as update the increment in height. Function tree() is defined to deal with this task. It takes 4 parameters: (L′, ΔHl), Key, (R′, ΔHr), and δ. Its result is (T′, ΔH), where T′ is the new tree after adjustment and ΔH is the new increment in height, defined as

$$\Delta H = |T'| - |T| \quad (4.10)$$

This can be broken down into 4 cases:

$$\begin{aligned} \Delta H &= |T'| - |T| \\ &= 1 + max(|R'|, |L'|) - (1 + max(|R|, |L|)) \\ &= max(|R'|, |L'|) - max(|R|, |L|) \\ &= \begin{cases} \Delta H_r & \delta \geq 0 \land \delta' \geq 0 \\ \delta + \Delta H_r & \delta \leq 0 \land \delta' \geq 0 \\ \Delta H_l - \delta & \delta \geq 0 \land \delta' \leq 0 \\ \Delta H_l & \text{otherwise} \end{cases} \end{aligned} \quad (4.11)$$

To prove this equation, note the fact that the height can't increase in both the left and the right sub-tree with only one insertion.
  • 290. nition of balance factor de
  • 291. nition that it equal to the dierence from the right sub tree and left sub tree. If 0 and 0 0, it means that the height of right sub tree isn't less than the height of left sub tree both before insertion and after insertion. In this case, the increment in height of the tree is only `contributed' from the right sub tree, which is Hr. If 0, which means the height of left sub tree isn't less than the height of right sub tree before, and it becomes 0 0, which means that the height of right sub tree increases due to insertion, and the left side keeps same (jL0j = jLj). So the increment in height is H = max(jR0j; jL0j) max(jRj; jLj) f 0 ^ 0 0g = jR0j jL0j fjLj = jL0jg = jRj + Hr jLj = + Hr For the case 0 ^ 0 0, Similar as the second one, we can get. H = max(jR0j; jL0j) max(jRj; jLj) f 0 ^ 0 0g = jL0j jRj = jLj + Hl jRj = Hl
  • 292. 88 CHAPTER 4. AVL TREE For the last case, the both and 0 is no bigger than zero, which means the height left sub tree is always greater than or equal to the right sub tree, so the increment in height is only `contributed' from the right sub tree, which is Hl. The next problem in front of us is how to determine the new balancing factor value 0 before performing balancing adjustment. According to the de
The next problem in front of us is how to determine the new balance factor value δ' before performing the balancing adjustment. According to the definition of the AVL tree, the balance factor is the height of the right sub-tree minus the height of the left sub-tree. We have the following fact.

δ' = |R'| − |L'|
   = |R| + ΔHr − (|L| + ΔHl)
   = |R| − |L| + ΔHr − ΔHl
   = δ + ΔHr − ΔHl    (4.12)

With all these changes in height and balance factor made clear, it's possible to define the tree() function mentioned in (4.9).

tree((L', ΔHl), Key, (R', ΔHr), δ) = balance(node(L', Key, R', δ'), ΔH)    (4.13)

Before we move into the details of the balancing adjustment, let's translate the above equations to real programs in Haskell. First is the insert function.

insert :: (Ord a) => AVLTree a -> a -> AVLTree a
insert t x = fst $ ins t where
    ins Empty = (Br Empty x Empty 0, 1)
    ins (Br l k r d)
        | x < k     = tree (ins l) k (r, 0) d
        | x == k    = (Br l k r d, 0)
        | otherwise = tree (l, 0) k (ins r) d

Here we also handle the case of inserting a duplicated key (which means the key already exists) by just overwriting.

tree :: (AVLTree a, Int) -> a -> (AVLTree a, Int) -> Int -> (AVLTree a, Int)
tree (l, dl) k (r, dr) d = balance (Br l k r d', delta) where
    d' = d + dr - dl
    delta = deltaH d d' dl dr

And the definition of the height increment is as below.

deltaH :: Int -> Int -> Int -> Int -> Int
deltaH d d' dl dr
    | d >= 0 && d' >= 0 = dr
    | d <= 0 && d' >= 0 = d + dr
    | d >= 0 && d' <= 0 = dl - d
    | otherwise         = dl
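As a quick sanity check of equation (4.11), the four cases can also be transcribed into Python and exercised on concrete values. This is an illustrative sketch only; the name delta_height is an assumption, not part of the book's source code.

def delta_height(d, d1, dl, dr):
    # d: balance factor before insertion; d1: after insertion;
    # dl, dr: height increments of the left and right sub-trees
    # (at most one of them is non-zero for a single insertion).
    if d >= 0 and d1 >= 0:
        return dr
    elif d <= 0 and d1 >= 0:
        return d + dr
    elif d >= 0 and d1 <= 0:
        return dl - d
    else:
        return dl

For example, inserting into the right sub-tree of a perfectly balanced node (δ = 0, ΔHr = 1, so δ' = 1) gives delta_height(0, 1, 0, 1) = 1: the whole tree grows by one. Inserting into the shorter side of a left-leaning node (δ = −1, ΔHr = 1, so δ' = 0) gives delta_height(-1, 0, 0, 1) = 0: the height doesn't change.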
4.3.1 Balancing adjustment

As the pattern matching approach is adopted for re-balancing, we need to consider what kinds of patterns violate the AVL tree property. Figure 4.3 shows the 4 cases which need fixing. For all these 4 cases the balance factors are either −2 or +2, which exceed the range [−1, 1]. After the balancing adjustment, this factor turns to 0, which means the height of the left sub-tree becomes equal to that of the right sub-tree.

Figure 4.3: 4 cases for balancing an AVL tree after insertion. (The figure shows the four unbalanced shapes — left-left with δ(z) = −2, δ(y) = −1; right-right with δ(x) = 2, δ(y) = 1; right-left with δ(x) = 2, δ(z) = −1; left-right with δ(z) = −2, δ(x) = 1 — each rotated to the balanced tree rooted at y with δ'(y) = 0.)

We call these four cases left-left lean, right-right lean, right-left lean, and left-right lean, in clockwise direction from top-left. We denote the balance factors before fixing as δ(x), δ(y), and δ(z); after fixing, they change to δ'(x), δ'(y), and δ'(z) respectively. We'll next prove that after fixing we have δ'(y) = 0 for all four cases, and we'll provide the resulting values of δ'(x) and δ'(z).
Left-left lean case

As the structure of sub-tree x doesn't change due to fixing, we immediately get δ'(x) = δ(x). Since δ(y) = −1 and δ(z) = −2, we have

δ(y) = |C| − |x| = −1 ⇒ |C| = |x| − 1
δ(z) = |D| − |y| = −2 ⇒ |D| = |y| − 2    (4.14)

After fixing:

δ'(z) = |D| − |C|    {From (4.14)}
      = |y| − 2 − (|x| − 1)
      = |y| − |x| − 1    {x is a child of y ⇒ |y| − |x| = 1}
      = 0    (4.15)

For δ'(y), we have the following fact after fixing.

δ'(y) = |z| − |x|
      = 1 + max(|C|, |D|) − |x|    {By (4.15), we have |C| = |D|}
      = 1 + |C| − |x|    {By (4.14)}
      = 1 + |x| − 1 − |x|
      = 0    (4.16)

Summarizing the above results, the left-left lean case adjusts the balance factors as the following.

δ'(x) = δ(x)
δ'(y) = 0
δ'(z) = 0    (4.17)

Right-right lean case

Since the right-right case is symmetric to the left-left case, we can easily get the resulting balance factors:

δ'(x) = 0
δ'(y) = 0
δ'(z) = δ(z)    (4.18)

Right-left lean case

First let's consider δ'(x).
After the balance fixing, we have

δ'(x) = |B| − |A|    (4.19)

Before fixing, if we calculate the height of z, we can get

|z| = 1 + max(|y|, |D|)    {δ(z) = −1 ⇒ |y| − |D| = 1 ⇒ |y| > |D|}
    = 1 + |y|
    = 2 + max(|B|, |C|)    (4.20)

Since δ(x) = 2, we can deduce that

δ(x) = 2 ⇒ |z| − |A| = 2    {By (4.20)}
      ⇒ 2 + max(|B|, |C|) − |A| = 2
      ⇒ max(|B|, |C|) − |A| = 0    (4.21)

If δ(y) = 1, which means |C| − |B| = 1, then

max(|B|, |C|) = |C| = |B| + 1    (4.22)

Taking this into (4.21) yields

|B| + 1 − |A| = 0 ⇒ |B| − |A| = −1    {By (4.19)}
              ⇒ δ'(x) = −1    (4.23)

If δ(y) ≠ 1, it means max(|B|, |C|) = |B|; taking this into (4.21) yields

|B| − |A| = 0    {By (4.19)}
⇒ δ'(x) = 0    (4.24)

Summarizing these 2 cases, we get the relationship between δ'(x) and δ(y) as the following.

δ'(x) =
    −1 : δ(y) = 1
     0 : otherwise    (4.25)

For δ'(z), according to the definition it is equal to

δ'(z) = |D| − |C|    {δ(z) = −1 ⇒ |D| = |y| − 1}
      = |y| − |C| − 1    {|y| = 1 + max(|B|, |C|)}
      = max(|B|, |C|) − |C|    (4.26)

If δ(y) = −1, then we have |C| − |B| = −1, so max(|B|, |C|) = |B| = |C| + 1. Taking this into (4.26), we get δ'(z) = 1. If δ(y) ≠ −1, then max(|B|, |C|) = |C|, and we get δ'(z) = 0. Combining these two cases, the relationship between δ'(z) and δ(y) is as below.

δ'(z) =
    1 : δ(y) = −1
    0 : otherwise    (4.27)

Finally, for δ'(y), we deduce it like below.

δ'(y) = |z| − |x| = max(|C|, |D|) − max(|A|, |B|)    (4.28)

There are three cases. If δ(y) = 0, it means |B| = |C|, and according to (4.25) and (4.27), we have δ'(x) = 0 ⇒ |A| = |B| and δ'(z) = 0 ⇒ |C| = |D|; these lead to δ'(y) = 0.

If δ(y) = 1, from (4.27) we have δ'(z) = 0 ⇒ |C| = |D|.

δ'(y) = max(|C|, |D|) − max(|A|, |B|)    {|C| = |D|}
      = |C| − max(|A|, |B|)    {From (4.25): δ'(x) = −1 ⇒ |B| − |A| = −1}
      = |C| − (|B| + 1)    {δ(y) = 1 ⇒ |C| − |B| = 1}
      = 0

If δ(y) = −1, from (4.25) we have δ'(x) = 0 ⇒ |A| = |B|.

δ'(y) = max(|C|, |D|) − max(|A|, |B|)    {|A| = |B|}
      = max(|C|, |D|) − |B|    {From (4.27): |D| − |C| = 1}
      = |C| + 1 − |B|    {δ(y) = −1 ⇒ |C| − |B| = −1}
      = 0

All three cases lead to the same result that δ'(y) = 0. Collecting all the above results, we get the new balance factors after fixing as the following.

δ'(x) =
    −1 : δ(y) = 1
     0 : otherwise

δ'(y) = 0

δ'(z) =
    1 : δ(y) = −1
    0 : otherwise    (4.29)

Left-right lean case

The left-right lean case is symmetric to the right-left lean case. By a similar deduction, we can
find that the new balance factors are identical to the result in (4.29).

4.3.2 Pattern Matching

All the problems have been solved and it's time to define the fixing function.

balance(T, ΔH) =
    (node(node(A, x, B, δ(x)), y, node(C, z, D, 0), 0), 0)      : Pll(T)
    (node(node(A, x, B, 0), y, node(C, z, D, δ(z)), 0), 0)      : Prr(T)
    (node(node(A, x, B, δ'(x)), y, node(C, z, D, δ'(z)), 0), 0) : Prl(T) ∨ Plr(T)
    (T, ΔH)                                                     : otherwise    (4.30)

Where Pll(T) means the pattern of tree T is left-left lean, and similarly for the other three. δ'(x) and δ'(z) are defined in (4.29). The four patterns are tested as below.

Pll(T): T = node(node(node(A, x, B, δ(x)), y, C, −1), z, D, −2)
Prr(T): T = node(A, x, node(B, y, node(C, z, D, δ(z)), 1), 2)
Plr(T): T = node(node(A, x, node(B, y, C, δ(y)), 1), z, D, −2)
Prl(T): T = node(A, x, node(node(B, y, C, δ(y)), z, D, −1), 2)    (4.31)

Translating the above function definition to Haskell yields a simple and intuitive program.
balance :: (AVLTree a, Int) -> (AVLTree a, Int)
balance (Br (Br (Br a x b dx) y c (-1)) z d (-2), _) =
    (Br (Br a x b dx) y (Br c z d 0) 0, 0)
balance (Br a x (Br b y (Br c z d dz) 1) 2, _) =
    (Br (Br a x b 0) y (Br c z d dz) 0, 0)
balance (Br (Br a x (Br b y c dy) 1) z d (-2), _) =
    (Br (Br a x b dx') y (Br c z d dz') 0, 0) where
        dx' = if dy ==  1 then -1 else 0
        dz' = if dy == -1 then  1 else 0
balance (Br a x (Br (Br b y c dy) z d (-1)) 2, _) =
    (Br (Br a x b dx') y (Br c z d dz') 0, 0) where
        dx' = if dy ==  1 then -1 else 0
        dz' = if dy == -1 then  1 else 0
balance (t, d) = (t, d)

The insertion algorithm takes time proportional to the height of the tree, and according to the result we proved above, its performance is O(lg N), where N is the number of elements stored in the AVL tree.

Verification

One can easily create a function to verify that a tree is an AVL tree. Actually, we need to verify two things: first, that it's a binary search tree; second, that it satisfies the AVL tree property. We leave the first verification problem as an exercise to the reader.

In order to test whether a binary tree satisfies the AVL tree property, we can test the difference in height between its two children, and recursively test that both children conform to the AVL property, until we arrive at an empty leaf.

avl?(T) =
    True                                  : T = φ
    avl?(L) ∧ avl?(R) ∧ ||R| − |L|| ≤ 1   : otherwise    (4.32)

And the height of an AVL tree can also be calculated from the definition.

|T| =
    0                    : T = φ
    1 + max(|R|, |L|)    : otherwise    (4.33)

The corresponding Haskell program is given as the following.

isAVL :: (AVLTree a) -> Bool
isAVL Empty = True
isAVL (Br l _ r d) = and [isAVL l, isAVL r, abs (height r - height l) <= 1]

height :: (AVLTree a) -> Int
height Empty = 0
height (Br l _ r _) = 1 + max (height l) (height r)
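The same checks translate directly to Python for the imperative representation used later in this chapter. This is a minimal sketch, assuming nodes carry left and right references as in the Node class shown earlier.

def height(t):
    # an empty tree has height 0, as in (4.33)
    if t is None:
        return 0
    return 1 + max(height(t.left), height(t.right))

def is_avl(t):
    # an empty tree is an AVL tree, as in (4.32)
    if t is None:
        return True
    return is_avl(t.left) and is_avl(t.right) and \
           abs(height(t.right) - height(t.left)) <= 1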
Exercise 4.1

Write a program to verify that a binary tree is a binary search tree in your favorite programming language. If you choose an imperative language, please consider realizing this program without recursion.

4.4 Deletion

As we mentioned before, deletion doesn't make significant sense in purely functional settings. As the tree is read-only, it typically serves frequent looking up after being built. Even if we implement deletion, it's actually re-building the tree, as we presented in the chapter about red-black trees. We leave the deletion of the AVL tree as an exercise to the reader.

Exercise 4.2

Taking the red-black tree deletion algorithm as an example, write the AVL tree deletion program in a purely functional approach in your favorite programming language.

Write the deletion algorithm in an imperative approach in your favorite programming language.

4.5 Imperative AVL tree algorithm ⋆

We have almost finished all the content about the AVL tree in this chapter. However, it is necessary to show the traditional insert-and-rotate approach as the comparator to the pattern matching algorithm. Similar to the imperative red-black tree algorithm, the strategy is first to do the insertion just as for the binary search tree, then fix the balance problem by rotation, and return the final result.

1: function Insert(T, k)
2:   root ← T
3:   x ← Create-Leaf(k)
4:   δ(x) ← 0
5:   parent ← NIL
6:   while T ≠ NIL do
7:     parent ← T
8:     if k < Key(T) then
9:       T ← Left(T)
10:    else
11:      T ← Right(T)
12:  Parent(x) ← parent
13:  if parent = NIL then    ▷ tree T is empty
14:    return x
15:  else if k < Key(parent) then
16:    Left(parent) ← x
17:  else
18:    Right(parent) ← x
19:  return AVL-Insert-Fix(root, x)

Note that after insertion, the height of the tree may increase, so that the balance factor δ may also change: inserting on the right side will increase δ by 1, while inserting on the left side will decrease it. By the end of this algorithm, we need to perform bottom-up fixing from node x towards the root.
We can translate the pseudo code to a real programming language, such as Python².

def avl_insert(t, key):
    root = t
    x = Node(key)
    parent = None
    while t:
        parent = t
        if key < t.key:
            t = t.left
        else:
            t = t.right
    if parent is None: # tree is empty
        root = x
    elif key < parent.key:
        parent.set_left(x)
    else:
        parent.set_right(x)
    return avl_insert_fix(root, x)

This is a top-down algorithm. It searches the tree from the root down to the proper position and inserts the new key as a leaf. By the end, it calls the fixing procedure, passing it the root and the newly inserted node. Note that we reuse the same methods set_left() and set_right() as defined in the chapter about red-black trees.

²C and C++ source code are available along with this book.
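For readers who want to run the example standalone, here is a minimal sketch of the set_left/set_right helpers the text refers to. Their real definitions live in the red-black tree chapter, so treat these as assumptions about their behavior: they link both the child reference and the parent pointer.

def set_left(self, x):
    self.left = x
    if x is not None:
        x.parent = self

def set_right(self, x):
    self.right = x
    if x is not None:
        x.parent = self

# attach them to the Node class sketched earlier
Node.set_left = set_left
Node.set_right = set_right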
In order to resume the AVL tree balance property, the fixing procedure first determines whether the new node was inserted on the left hand or the right hand side. If it is on the left, the balance factor δ decreases; otherwise it increases. If we denote the new value as δ', there are 3 cases for the relationship between δ and δ'.

If |δ| = 1 and |δ'| = 0, this means adding the new node makes the tree perfectly balanced; the height of the parent node doesn't change, and the algorithm can terminate.

If |δ| = 0 and |δ'| = 1, it means that the height of either the left or the right sub-tree increases; we need to go on checking the upper level of the tree.

If |δ| = 1 and |δ'| = 2, it means the AVL tree property is violated due to the new insertion. We need to perform rotation to fix it.

1: function AVL-Insert-Fix(T, x)
2:   while Parent(x) ≠ NIL do
3:     δ ← δ(Parent(x))
4:     if x = Left(Parent(x)) then
5:       δ' ← δ − 1
6:     else
7:       δ' ← δ + 1
8:     δ(Parent(x)) ← δ'
9:     P ← Parent(x)
10:    L ← Left(P)
11:    R ← Right(P)
12:    if |δ| = 1 and |δ'| = 0 then    ▷ Height doesn't change, terminate.
13:      return T
14:    else if |δ| = 0 and |δ'| = 1 then    ▷ Go on bottom-up updating.
15:      x ← P
16:    else if |δ| = 1 and |δ'| = 2 then
17:      if δ' = 2 then
18:        if δ(R) = 1 then    ▷ Right-right case
19:          δ(P) ← 0    ▷ By (4.18)
20:          δ(R) ← 0
21:          T ← Left-Rotate(T, P)
22:        if δ(R) = −1 then    ▷ Right-left case
23:          δy ← δ(Left(R))    ▷ By (4.29)
24:          if δy = 1 then
25:            δ(P) ← −1
26:          else
27:            δ(P) ← 0
28:          δ(Left(R)) ← 0
29:          if δy = −1 then
30:            δ(R) ← 1
31:          else
32:            δ(R) ← 0
33:          T ← Right-Rotate(T, R)
34:          T ← Left-Rotate(T, P)
35:      if δ' = −2 then
36:        if δ(L) = −1 then    ▷ Left-left case
37:          δ(P) ← 0
38:          δ(L) ← 0
39:          T ← Right-Rotate(T, P)
40:        else    ▷ Left-right case
41:          δy ← δ(Right(L))
42:          if δy = 1 then
43:            δ(L) ← −1
44:          else
45:            δ(L) ← 0
46:          δ(Right(L)) ← 0
47:          if δy = −1 then
48:            δ(P) ← 1
49:          else
50:            δ(P) ← 0
51:          T ← Left-Rotate(T, L)
52:          T ← Right-Rotate(T, P)
53:      break
54:  return T

Here we reuse the rotation algorithms mentioned in the red-black tree chapter. The rotation operation itself doesn't update balance factors at all; however, since rotation changes (actually improves) the balance situation, we should update these factors here, referring to the results of the previous section. Among the four cases, the right-right case and the left-left case only need one rotation, while the right-left
case and the left-right case need two rotations. The relevant Python program is shown as the following.

def avl_insert_fix(t, x):
    while x.parent is not None:
        d2 = d1 = x.parent.delta
        if x == x.parent.left:
            d2 = d2 - 1
        else:
            d2 = d2 + 1
        x.parent.delta = d2
        (p, l, r) = (x.parent, x.parent.left, x.parent.right)
        if abs(d1) == 1 and abs(d2) == 0:
            return t
        elif abs(d1) == 0 and abs(d2) == 1:
            x = x.parent
        elif abs(d1) == 1 and abs(d2) == 2:
            if d2 == 2:
                if r.delta == 1: # Right-right case
                    p.delta = 0
                    r.delta = 0
                    t = left_rotate(t, p)
                if r.delta == -1: # Right-left case
                    dy = r.left.delta
                    if dy == 1:
                        p.delta = -1
                    else:
                        p.delta = 0
                    r.left.delta = 0
                    if dy == -1:
                        r.delta = 1
                    else:
                        r.delta = 0
                    t = right_rotate(t, r)
                    t = left_rotate(t, p)
            if d2 == -2:
                if l.delta == -1: # Left-left case
                    p.delta = 0
                    l.delta = 0
                    t = right_rotate(t, p)
                if l.delta == 1: # Left-right case
                    dy = l.right.delta
                    if dy == 1:
                        l.delta = -1
                    else:
                        l.delta = 0
                    l.right.delta = 0
                    if dy == -1:
                        p.delta = 1
                    else:
                        p.delta = 0
                    t = left_rotate(t, l)
                    t = right_rotate(t, p)
            break
    return t
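The left_rotate and right_rotate procedures used above come from the red-black tree chapter. For completeness, here is a minimal sketch of how they are commonly realized for nodes with parent pointers; this is an assumption about their shape, not that chapter's definitive code. Each takes the root t and a pivot node, and returns the (possibly new) root.

def left_rotate(t, x):
    # rotate the edge between x and its right child y to the left
    (parent, y) = (x.parent, x.right)
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.left = x
    x.parent = y
    y.parent = parent
    if parent is None:
        t = y    # y becomes the new root
    elif parent.left == x:
        parent.left = y
    else:
        parent.right = y
    return t

def right_rotate(t, x):
    # symmetric to left_rotate, pivoting on the left child
    (parent, y) = (x.parent, x.left)
    x.left = y.right
    if y.right is not None:
        y.right.parent = x
    y.right = x
    x.parent = y
    y.parent = parent
    if parent is None:
        t = y
    elif parent.left == x:
        parent.left = y
    else:
        parent.right = y
    return t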
We skip the AVL tree deletion algorithm and leave it as an exercise to the reader.

4.6 Chapter note

The AVL tree was invented in 1962 by Adelson-Velskii and Landis [3], [4]. The name AVL comes from the two inventors' names. It's earlier than the red-black tree.

It's very common to compare the AVL tree and the red-black tree: both are self-balancing binary search trees, and all the major operations take O(lg N) time for both. From the result of (4.7), the AVL tree is more rigidly balanced, hence it is faster than the red-black tree in lookup-intensive applications [3]. However, red-black trees may perform better in cases with frequent insertion and removal. Many popular self-balancing binary search tree libraries, such as STL, are implemented on top of the red-black tree. Nevertheless, the AVL tree provides an intuitive and effective solution to the balance problem as well.

After this chapter, we'll extend the tree data structure from storing data in nodes to storing information on edges, which leads to Trie and Patricia, etc. If we extend the number of children from two to more, we get the B-tree. These data structures will be introduced next.
Bibliography

[1] Data.Tree.AVL. http://hackage.haskell.org/packages/archive/AvlTree/4.2/doc/html/Data-Tree-AVL.html
[2] Chris Okasaki. "FUNCTIONAL PEARLS: Red-Black Trees in a Functional Setting". J. Functional Programming, 1998.
[3] Wikipedia. "AVL tree". http://en.wikipedia.org/wiki/AVL_tree
[4] Guy Cousineau, Michel Mauny. "The Functional Approach to Programming". Cambridge University Press; English Ed edition (October 29, 1998). ISBN-13: 978-0521576819
[5] Pavel Grafov. "Implementation of an AVL tree in Python". http://github.com/pgrafov/python-avl-tree
Chapter 5
Trie and Patricia

5.1 Introduction

The binary trees introduced so far store information in nodes. It's also possible to store the information in edges. Trie and Patricia are important data structures for information retrieval and manipulation. They were invented in the 1960s, and are widely used in compiler design [2] and in bio-informatics, such as DNA pattern matching [3].

Figure 5.1: Radix tree. (The figure shows a binary tree whose edges are labeled 0 or 1; the marked nodes spell out the stored keys.)

Figure 5.1 shows a radix tree [2]. It contains the bit strings 1011, 10, 011, 100 and 0. When searching for a key k = (b0 b1 ... bn)₂, we take the first bit b0 (the MSB from the left) and check whether it is 0 or 1. If it is 0, we turn left; we turn right for 1. We then take the second bit and repeat this search till we either meet a leaf or finish all n bits.

The radix tree needn't store keys in nodes at all. The information is represented by the edges. The nodes marked with keys in the above figure are only for illustration purposes.

It is very natural to come to the idea: `is it possible to represent the key as an integer instead of a string?' Because an integer can be written in binary format, such an approach can save space. Another advantage is speed, because we can use bit-wise operations in most programming environments.

5.2 Integer Trie

The data structure shown in
figure 5.1 is called a binary trie. The trie was invented by Edward Fredkin. The name comes from "retrieval", pronounced /'tri:/ by the inventor, while it is pronounced /'trai/ `try' by other authors [5]. Trie is also called a prefix tree or radix tree. A binary trie is a special binary tree in which the placement of each key is controlled by its bits: each 0 means `go left' and each 1 means `go right' [2].

Because integers can be represented in binary format, it is possible to store integer keys rather than 0-1 strings. When inserting an integer as the new key to the trie, we change it to its binary form, then examine the first bit. If it is 0, we recursively insert the rest of the bits to the left sub-tree; otherwise, if it is 1, we insert into the right sub-tree.

There is a problem when we treat the key as an integer. Consider the binary trie shown in figure 5.2. Represented as 0-1 strings, all the three keys are different; but they are identical when turned into integers. Where should we insert, for example, decimal 3 into the trie?

One approach is to treat all the prefix zeros as effective bits. Suppose the integer is represented with 32 bits. If we want to insert key 1, it ends up with a tree of 32 levels: there are 31 nodes, each with only a left sub-tree, and the last node only has a right sub-tree. It is very inefficient in terms of space.

Okasaki shows a method to solve this problem in [2]. Instead of using the big-endian integer, we can use the little-endian integer to represent the key. Thus decimal 1 is represented as binary 1. Inserting it into the empty binary trie, the result is a trie with a root and a right leaf; there is only 1 level. Decimal 2 is represented as 01, and decimal 3 is (11)₂ in little-endian binary format. There is no need to add any prefix 0; the position in the trie is uniquely determined.

5.2.1 Definition of integer Trie

In order to define the little-endian binary trie, we can reuse the structure of the binary tree. A binary trie node is either empty or a branch. The branch node contains a left child, a right child, and an optional value as satellite data. The left sub-tree is encoded as 0 and the right sub-tree is encoded as 1. The following example Haskell code defines the trie algebraic data type.

data IntTrie a = Empty | Branch (IntTrie a) (Maybe a) (IntTrie a)
Figure 5.2: A big-endian trie. (The figure shows several distinct 0-1 strings, such as 0011 and 11, which all represent the same integer.)

Below is another example definition in Python.

class IntTrie:
    def __init__(self):
        self.left = self.right = None
        self.value = None

5.2.2 Insertion

Since the key is a little-endian integer, when inserting a key we take its bits one by one from the right-most (the LSB). If the bit is 0, we go to the left; otherwise we go to the right for 1. If the child is empty, we need to create a new node. We repeat this to the last bit (the MSB) of the key.

1: function Insert(T, k, v)
2:   if T = NIL then
3:     T ← Empty-Node
4:   p ← T
5:   while k ≠ 0 do
6:     if Even?(k) then
7:       if Left(p) = NIL then
8:         Left(p) ← Empty-Node
9:       p ← Left(p)
10:    else
11:      if Right(p) = NIL then
12:        Right(p) ← Empty-Node
13:      p ← Right(p)
14:    k ← ⌊k/2⌋
15:  Data(p) ← v
16:  return T

This algorithm takes 3 arguments: a trie T, a key k, and the satellite data v. The following Python example code implements the insertion algorithm. The satellite data is optional; it is empty by default.

def trie_insert(t, key, value = None):
    if t is None:
        t = IntTrie()
    p = t
    while key != 0:
        if key & 1 == 0:
            if p.left is None:
                p.left = IntTrie()
            p = p.left
        else:
            if p.right is None:
                p.right = IntTrie()
            p = p.right
        key = key >> 1
    p.value = value
    return t

Figure 5.3 shows a trie which is created by inserting the pairs of key and value {1 → a, 4 → b, 5 → c, 9 → d} into the empty trie.

Figure 5.3: A little-endian integer binary trie for the map {1 → a, 4 → b, 5 → c, 9 → d}.
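The trie in figure 5.3 can be reproduced with the program above. Below is a small usage sketch; the assertion spells out the path for key 9 = (1001)₂, taken LSB first.

t = None
for (k, v) in [(1, 'a'), (4, 'b'), (5, 'c'), (9, 'd')]:
    t = trie_insert(t, k, v)

# key 9 = (1001)2; reading bits LSB first gives: right, left, left, right
assert t.right.left.left.right.value == 'd'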
Because the definition of the integer trie is recursive, it's natural to define the insertion algorithm recursively. If the LSB is 0, the key to be inserted is even, and we recursively insert it to the left child; we can divide the key by 2 to get rid of the LSB. If the LSB is 1, the key is an odd number, and the recursive insertion is applied to the right child.

For trie T, denote the left and right children as Tl and Tr respectively. Thus T = (Tl, d, Tr), where d is the optional satellite data. If T is empty, Tl, Tr and d are defined as empty as well.

insert(T, k, v) =
    (Tl, v, Tr)                      : k = 0
    (insert(Tl, k/2, v), d, Tr)      : even(k)
    (Tl, d, insert(Tr, ⌊k/2⌋, v))    : otherwise    (5.1)

If the key to be inserted already exists, this algorithm just overwrites the previously stored data. This can be replaced with other alternatives, such as storing the data in a linked list, etc. The following Haskell example program implements the insertion algorithm.

insert t 0 x = Branch (left t) (Just x) (right t)
insert t k x
    | even k = Branch (insert (left t) (k `div` 2) x) (value t) (right t)
    | otherwise = Branch (left t) (value t) (insert (right t) (k `div` 2) x)

left (Branch l _ _) = l
left Empty = Empty

right (Branch _ _ r) = r
right Empty = Empty

value (Branch _ v _) = v
value Empty = Nothing

For a given integer k with m bits in binary, the insertion algorithm goes through m levels of the trie. The performance is bound to O(m) time.

5.2.3 Look up

To look up key k in the little-endian integer binary trie, we take each bit of k from the left (the LSB), then go left if this bit is 0; otherwise, we go right. The look up completes when all bits are consumed.

1: function Lookup(T, k)
2:   while k ≠ 0 ∧ T ≠ NIL do
3:     if Even?(k) then
4:       T ← Left(T)
5:     else
6:       T ← Right(T)
7:     k ← ⌊k/2⌋
8:   if T ≠ NIL then
9:     return Data(T)
10:  else
11:    return not found

The below Python example code uses bit-wise operations to implement the look up algorithm.

def lookup(t, key):
    while key != 0 and (t is not None):
        if key & 1 == 0:
            t = t.left
        else:
            t = t.right
        key = key >> 1
    if t is not None:
        return t.value
    else:
        return None

Looking up can also be defined in a recursive manner. If the tree is empty, the look up fails. If k = 0, the satellite data is the result to be found. If the last bit is 0, we recursively look up the left child; otherwise, we look up the right child.

lookup(T, k) =
    φ                   : T = φ
    d                   : k = 0
    lookup(Tl, k/2)     : even(k)
    lookup(Tr, ⌊k/2⌋)   : otherwise    (5.2)

The following Haskell example program implements the recursive look up algorithm.

search Empty k = Nothing
search t 0 = value t
search t k = if even k then search (left t) (k `div` 2)
             else search (right t) (k `div` 2)

The look up algorithm is bound to O(m) time, where m is the number of bits for a given key.

5.3 Integer Patricia

Trie has some drawbacks. It wastes a lot of space. Note in
figure 5.2, only leaves store real data. Typically, the integer binary trie contains many nodes that have only one child. One improvement idea is to compress the chained nodes together. Patricia is such a data structure, invented by Donald R. Morrison in 1968. Patricia stands for `practical algorithm to retrieve information coded in alphanumeric' [3]. It is another kind of prefix tree.

Okasaki gives an implementation of integer Patricia in [2]. If we merge the chained nodes which have only one child together in figure 5.3, we can get a Patricia tree as shown in figure 5.4. From this figure, we can find that the key of sibling nodes is the longest common prefix for them; they branch out at some certain bit. Patricia saves a lot of space compared to the trie.

Different from the integer trie, using the big-endian integer in Patricia doesn't cause the padding-zero problem mentioned in section 5.2. All zero bits before the MSB are omitted to save space. Okasaki lists some significant advantages of big-endian Patricia in [2].

5.3.1 Definition

The integer Patricia tree is a special kind of binary tree. It is either empty or a node, and there are two different types of node.
Figure 5.4: Little-endian Patricia for the map {1 → a, 4 → b, 5 → c, 9 → d}.

A node can be a leaf, containing an integer key and optional satellite data, or a branch node, containing a left and a right child. The two children share the longest common prefix bits of their keys. For the left child, the next bit of the key is zero, while it is one for the right child. The following Haskell example code defines Patricia accordingly.

type Key = Int
type Prefix = Int
type Mask = Int

data IntTree a = Empty
               | Leaf Key a
               | Branch Prefix Mask (IntTree a) (IntTree a)

In order to tell from which bit the left and right children differ, a mask is recorded in the branch node. Typically, the mask is a power of 2, like 2^n for some non-negative integer n; all bits lower than n don't belong to the common prefix.

The following Python example code defines Patricia as well as some auxiliary methods.

class IntTree:
    def __init__(self, key = None, value = None):
        self.key = key
        self.value = value
        self.prefix = self.mask = None
        self.left = self.right = None

    def set_children(self, l, r):
        self.left = l
        self.right = r

    def replace_child(self, x, y):
        if self.left == x:
            self.left = y
        else:
            self.right = y

    def is_leaf(self):
        return self.left is None and self.right is None

    def get_prefix(self):
        if self.prefix is None:
            return self.key
        else:
            return self.prefix

5.3.2 Insertion

When inserting a key, if the tree is empty, we can just create a leaf node with the given key and satellite data, as shown in
figure 5.5.

Figure 5.5: Left: the empty tree; Right: after inserting key 12.

If the tree is just a singleton leaf node x, we can create a new leaf y and put the key and data into it. After that, we need to create a new branch node and set x and y as the two children. In order to determine whether y should be the left or the right node, we need to find the longest common prefix of x and y. For example, if key(x) is 12 ((1100)₂ in binary) and key(y) is 15 ((1111)₂ in binary), then the longest common prefix is (11oo)₂, where o denotes the bits we don't care about. We can use another integer to mask those bits. In this case, the mask number is 4 ((100)₂ in binary). The next bit after the longest common prefix represents 2¹. This bit is 0 in key(x), while it is 1 in key(y), so we should set x as the left child and y as the right child. Figure 5.6 shows this example.

Figure 5.6: Left: a tree with the singleton leaf 12; Right: after inserting key 15 (prefix = 1100, mask = 100).

In case the tree is neither empty nor a singleton leaf, we need to first check whether the key to be inserted matches the longest common prefix recorded in the root, then recursively insert the key to the left or the right child according to the next bit after the common prefix. For example, if we insert key 14 ((1110)₂ in binary) into the result tree of figure 5.6, since the common prefix is (11oo)₂ and the next bit (the bit of 2¹) is 1, we need to recursively insert to the right child.

If the key to be inserted doesn't match the longest common prefix stored in the root, we need to branch a new leaf out. Figure 5.7 shows these two different cases.

Figure 5.7: Insert a key to a branch node. (a) Insert key 14: it matches the longest common prefix (1100)₂, so 14 is recursively inserted to the right sub-tree. (b) Insert key 5: it doesn't match the longest common prefix (1100)₂, so a new leaf is branched out.

For a given key k and value v, denote by (k, v) the leaf node. For a branch node, denote it in the form (p, m, Tl, Tr), where p is the longest common prefix, m is the mask, and Tl and Tr are the left and right children.
Summarizing the above cases, the insertion algorithm can be defined as the following.

insert(T, k, v) =
    (k, v)                        : T = φ ∨ T = (k, v')
    join(k, (k, v), k', T)        : T = (k', v'), k' ≠ k
    (p, m, insert(Tl, k, v), Tr)  : T = (p, m, Tl, Tr), match(k, p, m), zero(k, m)
    (p, m, Tl, insert(Tr, k, v))  : T = (p, m, Tl, Tr), match(k, p, m), ¬zero(k, m)
    join(k, (k, v), p, T)         : T = (p, m, Tl, Tr), ¬match(k, p, m)    (5.3)

The first clause deals with the edge cases: T is either empty, or a leaf node with the same key. The algorithm overwrites the previous value in the latter case.

The second clause handles the case where T is a leaf node, but with a different key. Here we need to branch out another new leaf. We need to extract the longest common prefix and determine which leaf should be set as the left child and which as the right one. Function join(k1, T1, k2, T2) does this work; we'll define it later.

The third clause deals with the case where T is a branch node, the longest common prefix matches the key to be inserted, and the next bit after the common prefix is zero. Here we need to recursively insert to the left child.

The fourth clause handles the similar case as the third one, except that the next bit after the common prefix is one, not zero. We need to recursively insert to the right child.

The last clause is for the case where the key to be inserted doesn't match the longest common prefix stored in the branch. We need to branch out a new leaf by calling the join function.

We need to define the function match(k, p, m), which tests whether the key k has the same prefix p above the masked bits m. For example, suppose the prefix stored in a branch node is (pn pn−1 ... pi ... p0)₂ in binary, key k is (kn kn−1 ... ki ... k0)₂ in binary, and the mask is (100...0)₂ = 2^i. They match if and only if pj = kj for all i ≤ j ≤ n.

One way to realize match is to test whether mask(k, m) = p holds, where mask(x, m) = x & ¬(m − 1): we perform bit-wise not of m − 1, then perform bit-wise and with x.

The function zero(k, m) tests whether the next bit after the common prefix is zero. With the help of the mask m, we can shift m one bit to the right, then perform bit-wise and with the key.

zero(k, m) = (k & shiftr(m, 1)) = 0    (5.4)

If the mask m = (100...0)₂ = 2^i and the bit of k next to position i is 1, that is k = (kn kn−1 ... 1 ... k0)₂, then zero(k, m) returns false; if k = (kn kn−1 ... 0 ... k0)₂, then the result is true.

The function join(p1, T1, p2, T2) takes two different prefixes and their trees. It extracts the longest common prefix of p1 and p2, creates a new branch node, and sets T1 and T2 as the two children.

join(p1, T1, p2, T2) =
    (p, m, T1, T2) : zero(p1, m), where (p, m) = LCP(p1, p2)
    (p, m, T2, T1) : ¬zero(p1, m)    (5.5)
In order to calculate the longest common prefix of p1 and p2, we can first compute their bit-wise exclusive-or, then count the number of bits of the result, and generate a mask m = 2^|xor(p1, p2)|. The longest common prefix p can be given by masking the bits of either p1 or p2 with m.

p = mask(p1, m)    (5.6)

The following Haskell example code implements the insertion algorithm.

import Data.Bits

insert t k x
   = case t of
       Empty -> Leaf k x
       Leaf k' x' -> if k == k' then Leaf k x
                     else join k (Leaf k x) k' t -- t@(Leaf k' x')
       Branch p m l r
          | match k p m -> if zero k m
                           then Branch p m (insert l k x) r
                           else Branch p m l (insert r k x)
          | otherwise -> join k (Leaf k x) p t -- t@(Branch p m l r)

join p1 t1 p2 t2 = if zero p1 m then Branch p m t1 t2
                   else Branch p m t2 t1
    where (p, m) = lcp p1 p2

lcp :: Prefix -> Prefix -> (Prefix, Mask)
lcp p1 p2 = (p, m) where
      m = bit (highestBit (p1 `xor` p2))
      p = mask p1 m

highestBit x = if x == 0 then 0 else 1 + highestBit (shiftR x 1)

mask x m = x .&. complement (m - 1) -- complement means bit-wise not

zero x m = x .&. (shiftR m 1) == 0

match k p m = (mask k m) == p

The insertion algorithm can also be realized imperatively.

1: function Insert(T, k, v)
2:   if T = NIL then
3:     T ← Create-Leaf(k, v)
4:     return T
5:   y ← T
6:   p ← NIL
7:   while y is not a leaf, and Match(k, Prefix(y), Mask(y)) do
8:     p ← y
9:     if Zero?(k, Mask(y)) then
10:      y ← Left(y)
11:    else
12:      y ← Right(y)
13:  if y is a leaf, and k = Key(y) then
14:    Data(y) ← v
15:  else
16:    z ← Branch(y, Create-Leaf(k, v))
17:    if p = NIL then
18:      T ← z
19:    else
20:      if Left(p) = y then
21:        Left(p) ← z
22:      else
23:        Right(p) ← z
24:  return T

Function Branch(T1, T2) does a similar job to join. It creates a new branch node, extracts the longest common prefix, and sets T1 and T2 as the two children.

1: function Branch(T1, T2)
2:   T ← Empty-Node
3:   (Prefix(T), Mask(T)) ← LCP(Prefix(T1), Prefix(T2))
4:   if Zero?(Prefix(T1), Mask(T)) then
5:     Left(T) ← T1
6:     Right(T) ← T2
7:   else
8:     Left(T) ← T2
9:     Right(T) ← T1
10:  return T

The following Python example program implements the insertion algorithm.

def insert(t, key, value = None):
    if t is None:
        t = IntTree(key, value)
        return t
    node = t
    parent = None
    while True:
        if match(key, node):
            parent = node
            if zero(key, node.mask):
                node = node.left
            else:
                node = node.right
        else:
            if node.is_leaf() and key == node.key:
                node.value = value
            else:
                new_node = branch(node, IntTree(key, value))
                if parent is None:
                    t = new_node
                else:
                    parent.replace_child(node, new_node)
            break
    return t

The auxiliary functions match, branch, lcp, etc. are given below.

def maskbit(x, mask):
    return x & (~(mask - 1))

def match(key, tree):
    return (not tree.is_leaf()) and maskbit(key, tree.mask) == tree.prefix

def zero(x, mask):
    return x & (mask >> 1) == 0

def lcp(p1, p2):
    diff = p1 ^ p2
    mask = 1
    while diff != 0:
        diff = diff >> 1
        mask = mask << 1
    return (maskbit(p1, mask), mask)

def branch(t1, t2):
    t = IntTree()
    (t.prefix, t.mask) = lcp(t1.get_prefix(), t2.get_prefix())
    if zero(t1.get_prefix(), t.mask):
        t.set_children(t1, t2)
    else:
        t.set_children(t2, t1)
    return t

Figure 5.8 shows the example Patricia created with the insertion algorithm.

Figure 5.8: Insert the map {1 → x, 4 → y, 5 → z} into the big-endian integer Patricia tree. (The root has prefix 0 and mask 8, with leaf 1:x on the left and a branch with prefix 100 and mask 2 on the right, holding leaves 4:y and 5:z.)
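The tree in figure 5.8 can be rebuilt directly with the Python program above. A short usage sketch:

t = None
for (k, v) in [(1, 'x'), (4, 'y'), (5, 'z')]:
    t = insert(t, k, v)

# the root (prefix 0, mask 8) splits on the bit of value 4: key 1 goes
# left, while 4 = (100)2 and 5 = (101)2 share the prefix (10o)2 on the right
assert t.left.value == 'x'
assert t.right.left.value == 'y' and t.right.right.value == 'z'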
5.3.3 Look up

Consider the property of the integer Patricia tree: when looking up a key, if it has a common prefix with the root, we check the next bit. If this bit is zero, we recursively look up the left child; otherwise we look up the right child. When we reach a leaf node, we can directly check whether the key of the leaf is equal to what we are looking up. This algorithm can be described with the following pseudo code.

1: function Look-Up(T, k)
2:   if T = NIL then
3:     return NIL    ▷ Not found
4:   while T is not a leaf, and Match(k, Prefix(T), Mask(T)) do
5:     if Zero?(k, Mask(T)) then
6:       T ← Left(T)
7:     else
8:       T ← Right(T)
9:   if T is a leaf, and Key(T) = k then
10:    return Data(T)
11:  else
12:    return NIL    ▷ Not found

The below Python example program implements the look up algorithm.

def lookup(t, key):
    if t is None:
        return None
    while (not t.is_leaf()) and match(key, t):
        if zero(key, t.mask):
            t = t.left
        else:
            t = t.right
    if t.is_leaf() and t.key == key:
        return t.value
    else:
        return None

The look up algorithm can also be realized in a recursive approach. If the Patricia tree T is empty, or it's a singleton leaf with a key different from what we are looking up, the result is empty, indicating a not-found error. If the tree is a singleton leaf and the key of this leaf is equal to what we are looking up, we are done. Otherwise, T is a branch node; we need to check whether the common prefix matches the key to be looked up, and recursively look up the child according to the next bit. If the common prefix doesn't match the key, the key doesn't exist in the tree, and we return an empty result to indicate the not-found error.

lookup(T, k) =
    φ             : T = φ ∨ (T = (k', v), k' ≠ k)
    v             : T = (k', v), k' = k
    lookup(Tl, k) : T = (p, m, Tl, Tr), match(k, p, m), zero(k, m)
    lookup(Tr, k) : T = (p, m, Tl, Tr), match(k, p, m), ¬zero(k, m)
    φ             : otherwise    (5.7)

The following Haskell example program implements this recursive look up algorithm.

search t k
  = case t of
      Empty -> Nothing
      Leaf k' x -> if k == k' then Just x else Nothing
      Branch p m l r
         | match k p m -> if zero k m then search l k
                          else search r k
         | otherwise -> Nothing

5.4 Alphabetic Trie

Integer-based tries and Patricia trees can be a good starting point. Such techniques play an important role in compiler implementation. Okasaki pointed out that the widely used Glasgow Haskell Compiler, GHC, utilized a similar implementation for several years before 1998 [2].

If we extend the key from integers to alphabetic values, tries and Patricia trees can be very powerful in solving textual manipulation problems.
5.4.1 Definition

It's not enough to just use the left and right children to represent alphabetic keys. Using English as an example, there are 26 letters, and each can be lower or upper case. If we don't care about the case, one solution is to limit the number of branches (children) to 26. Some simplified ANSI C implementations of the trie are defined with an array of 26 letters. This can be illustrated as in figure 5.9.

Not all branch nodes contain data. For instance, in figure 5.9, the root only has three non-empty branches, representing the letters 'a', 'b', and 'z'. Other branches, such as the one for letter 'c', are all empty. We don't show empty branches in the rest of this chapter.

If we deal with case-sensitive problems, or handle languages other than English, there can be more letters than 26. The problem of the dynamic size of the sub-branches can be solved by using a collection data structure, such as a hash table or a map.

An alphabetic trie is either empty or a node. There are two types of node: a leaf node doesn't have any sub-trees, while a branch node contains multiple sub-trees. Each sub-tree is bound to a character. Both leaf and branch nodes may contain optional satellite data. The following Haskell code shows the example definition.

data Trie a = Trie { value :: Maybe a
                   , children :: [(Char, Trie a)]}

empty = Trie Nothing []

Figure 5.9: A trie with 26 branches, containing the keys 'a', 'an', 'another', 'bool', 'boy' and 'zoo'.
The below ANSI C code defines the alphabetic trie. For illustration purposes only, we limit the character set to the lower case English letters, from 'a' to 'z'.

struct Trie {
    struct Trie* children[26];
    void* data;
};

5.4.2 Insertion

When inserting a string as a key, we start from the root and pick the characters from the string one by one, examining which child represents each character. If the corresponding child is empty, a new empty node is created. After that, the next character is used to select the proper grandchild. We repeat this process for all the characters and finally store the optional satellite data in the node we arrive at. The below pseudo code describes the insertion algorithm.

1: function Insert(T, k, v)
2:   if T = NIL then
3:     T ← Empty-Node
4:   p ← T
5:   for each c in k do
6:     if Children(p)[c] = NIL then
7:       Children(p)[c] ← Empty-Node
8:     p ← Children(p)[c]
9:   Data(p) ← v
10:  return T

The following example ANSI C program implements the insertion algorithm.

struct Trie* insert(struct Trie* t, const char* key, void* value) {
    int c;
    struct Trie* p;
    if (!t)
        t = create_node();
    for (p = t; *key; ++key, p = p->children[c]) {
        c = *key - 'a';
        if (!p->children[c])
            p->children[c] = create_node();
    }
    p->data = value;
    return t;
}

Where the function create_node creates a new empty node, with all children initialized to empty.

struct Trie* create_node() {
    struct Trie* t = (struct Trie*) malloc(sizeof(struct Trie));
    int i;
    for (i = 0; i < 26; ++i)
        t->children[i] = NULL;
    t->data = NULL;
    return t;
}

The insertion can also be realized in a recursive way. Denote the key to be inserted as K = k1 k2 ... kn, where ki is the i-th character. K' is the rest of the characters except k1, and v' is the satellite data to be inserted. The trie is in the form T = (v, C), where v is the satellite data and C = {(c1, T1), (c2, T2), ..., (cm, Tm)} is the map of children, mapping from character ci to sub-tree Ti. If T is empty, then C is also empty.

insert(T, K, v') =
    (v', C)                   : K = φ
    (v, ins(C, k1, K', v'))   : otherwise    (5.8)

If the key is empty, the previous value v is overwritten with v'. Otherwise, we need to check the children and perform a recursive insertion. This is realized in the function ins(C, k1, K', v'). It examines the key-sub-tree pairs in C one by one. Let C' be the rest of the pairs except for the
first one. This function can be defined as below.

ins(C, k1, K', v') =
    {(k1, insert(φ, K', v'))}          : C = φ
    {(k1, insert(T1, K', v'))} ∪ C'    : k1 = c1
    {(c1, T1)} ∪ ins(C', k1, K', v')   : otherwise    (5.9)

If C is empty, we create a pair mapping from character k1 to a new empty tree, and recursively insert the rest of the characters. Otherwise, the algorithm locates the child which is mapped from k1 for further insertion.

The following Haskell example program implements the insertion algorithm.

insert t [] x = Trie (Just x) (children t)
insert t (k:ks) x = Trie (value t) (ins (children t) k ks x) where
    ins [] k ks x = [(k, (insert empty ks x))]
    ins (p:ps) k ks x = if fst p == k
                        then (k, insert (snd p) ks x):ps
                        else p:(ins ps k ks x)

5.4.3 Look up

To look up a key, we also extract the characters from the key one by one. For each character, we search among the children to see whether there is a branch matching this character. If there is no such child, the look up process terminates immediately to indicate a not-found error. When we reach the last character of the key, the data stored in the current node is what we are looking up.

1: function Look-Up(T, key)
2:   if T = NIL then
3:     return not found
4:   for each c in key do
5:     if Children(T)[c] = NIL then
6:       return not found
7:     T ← Children(T)[c]
8:   return Data(T)

The below ANSI C program implements the look up algorithm. It returns NULL to indicate a not-found error.

void* lookup(struct Trie* t, const char* key) {
    while (*key && t && t->children[*key - 'a'])
        t = t->children[*key++ - 'a'];
    return (*key || !t) ? NULL : t->data;
}

The look up algorithm can also be realized in a recursive manner. When looking up a key, we start from the
first character. If it is bound to some child, we then recursively search the rest of the characters in that child. Denote the trie as T = (v, C), and the key being searched as K = k1 k2 ... kn if it isn't empty. The first character in the key is k1, and the rest of the characters are denoted as K'.

lookup(T, K) =
    v               : K = φ
    φ               : find(C, k1) = φ
    lookup(T', K')  : find(C, k1) = T'    (5.10)

Where the function find(C, k) examines the key-child pairs one by one to check whether any child is bound to character k. If the list of pairs C is empty, the result is empty, to indicate the non-existence of such a child. Otherwise, let C = {(k1, T1), (k2, T2), ..., (km, Tm)}, where the first sub-tree T1 is bound to k1, and the rest of the pairs are represented as C'. The below equation defines the find function.

find(C, k) =
    φ            : C = φ
    T1           : k1 = k
    find(C', k)  : otherwise    (5.11)

The following Haskell example program implements the trie look up algorithm. It uses the association-list lookup function provided in the standard library [5].

find t [] = value t
find t (k:ks) = case lookup k (children t) of
                  Nothing -> Nothing
                  Just t' -> find t' ks

Exercise 5.1

Develop an imperative trie by using a collection data structure to manage the multiple sub-trees in the alphabetic trie.

5.5 Alphabetic Patricia

Similar to the integer trie, the alphabetic trie is not memory efficient. We can use the same method to compress the alphabetic trie to a Patricia tree.
5.5.1 Definition

The alphabetic Patricia tree is a special prefix tree in which each node contains multiple branches. All children of a node share the longest common prefix string. As the result, there is no node with only one child, because that would conflict with the longest common prefix property.

If we turn the trie shown in figure 5.9 into a Patricia tree by compressing all nodes which have only one child, we can get the Patricia prefix tree shown in figure 5.10.

Figure 5.10: A Patricia prefix tree, with keys: 'a', 'an', 'another', 'bool', 'boy' and 'zoo'.

We can modify the definition of the alphabetic trie a bit to adapt it to Patricia. The Patricia tree is either empty, or a node in the form T = (v, C), where v is the optional satellite data and C = {(s1, T1), (s2, T2), ..., (sn, Tn)} is a list of pairs. Each pair contains a string si, which is bound to a sub-tree Ti. The following Haskell example code defines Patricia accordingly.

type Key = String

data Patricia a = Patricia { value :: Maybe a
                           , children :: [(Key, Patricia a)]}

empty = Patricia Nothing []

The below Python code reuses the definition for the trie to define Patricia.

class Patricia:
    def __init__(self, value = None):
        self.value = value
        self.children = {}

5.5.2 Insertion

When inserting a key s, if the Patricia tree is empty, we create a leaf node, as shown in
figure 5.11 (a). Otherwise, we need to check the children. If there is some sub-tree Ti bound to a string si, and there exists a common prefix between si and s, we need to branch out a new leaf Tj. The method is to create a new internal branch node, bind it with the common prefix, then set Ti as one child of this branch and Tj as the other child. Ti and Tj share the common prefix. This is shown in figure 5.11 (b).

However, there are two special cases: s may be a prefix of si, as shown in figure 5.11 (c), and si may be a prefix of s, as in figure 5.11 (d).

Figure 5.11: Patricia insertion. (a) Insert key `boy' into the empty Patricia; the result is a leaf. (b) Insert key `bool'; a new branch with the common prefix `bo' is created. (c) Insert key `an' with value y into a node with prefix `another'. (d) Insert `another' into the node with prefix `an'; we recursively insert key `other' to the child.

The insertion algorithm can be described as below.

1: function Insert(T, k, v)
2:   if T = NIL then
3:     T ← Empty-Node
4:   p ← T
5:   loop
6:     match ← FALSE
7:     for each (si, Ti) ∈ Children(p) do
8:       if k = si then
9:         Value(p) ← v
10:        return T
11:      c ← LCP(k, si)
12:      k1 ← k − c
13:      k2 ← si − c
14:      if c ≠ NIL then
15:        match ← TRUE
16:        if k2 = NIL then    ▷ si is a prefix of k
17:          p ← Ti
18:          k ← k1
19:          break
20:        else    ▷ Branch out a new leaf
21:          Children(p) ← Children(p) ∪ {(c, Branch(k1, v, k2, Ti))}
22:          Delete(Children(p), (si, Ti))
23:          return T
24:    if ¬match then    ▷ Add a new leaf
25:      Children(p) ← Children(p) ∪ {(k, Create-Leaf(v))}
26:      return T
27:  return T

In the above algorithm, the LCP function finds the longest common prefix of two given strings. For example, the strings `bool' and `boy' have the longest common prefix `bo'. The subtraction symbol '−' for strings gives the different part of two strings; for example, `bool' − `bo' = `ol'. The Branch function creates a branch node and updates the keys accordingly.

The longest common prefix can be extracted by checking the characters in the two strings one by one, till we meet two characters that don't match.

1: function LCP(A, B)
2:   i ← 1
3:   while i ≤ |A| ∧ i ≤ |B| ∧ A[i] = B[i] do
4:     i ← i + 1
5:   return A[1 ... i − 1]

There are two cases when we branch out a new leaf. Branch(s1, T1, s2, T2) takes two different keys and two trees. If s1 is empty, we are dealing with the case of, for example, inserting key `an' into a child bound to the string `another'. We set T2 as a child of T1. Otherwise, we create a new branch node and set T1 and T2 as the two children.

1: function Branch(s1, T1, s2, T2)
2:   if s1 = φ then
3:     Children(T1) ← Children(T1) ∪ {(s2, T2)}
4:     return T1
5:   T ← Empty-Node
6:   Children(T) ← {(s1, T1), (s2, T2)}
7:   return T

The following example Python program implements the Patricia insertion algorithm.

def insert(t, key, value = None):
    if t is None:
        t = Patricia()
    node = t
    while True:
        match = False
        for k, tr in node.children.items():
            if key == k: # just overwrite
                node.value = value
                return t
            (prefix, k1, k2) = lcp(key, k)
            if prefix != "":
                match = True
                if k2 == "":
                    # example: insert "another" into "an", go on traversing
                    node = tr
                    key = k1
                    break
                else: # branch out a new leaf
                    node.children[prefix] = branch(k1, Patricia(value), k2, tr)
                    del node.children[k]
                    return t
        if not match: # add a new leaf
            node.children[key] = Patricia(value)
            return t
    return t
Where the functions to find the longest common prefix and to branch out are implemented as below.

# returns (p, s1', s2'), where p is the lcp, s1' = s1 - p, s2' = s2 - p
def lcp(s1, s2):
    j = 0
    while j < len(s1) and j < len(s2) and s1[j] == s2[j]:
        j += 1
    return (s1[0:j], s1[j:], s2[j:])

def branch(key1, tree1, key2, tree2):
    if key1 == "":
        # example: insert "an" into "another"
        tree1.children[key2] = tree2
        return tree1
    t = Patricia()
    t.children[key1] = tree1
    t.children[key2] = tree2
    return t
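As a short usage sketch, the Patricia tree of figure 5.10 can be built by inserting the keys of figure 5.9 with the program above.

t = None
for w in ['a', 'an', 'another', 'bool', 'boy', 'zoo']:
    t = insert(t, w, w)

# 'an' and 'another' hang below the 'a' branch;
# 'bool' and 'boy' share the branch labeled 'bo'
assert sorted(t.children.keys()) == ['a', 'bo', 'zoo']
assert sorted(t.children['a'].children.keys()) == ['n']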
The insertion can also be realized recursively. Starting from the root, the program checks all the children to find whether there is a node matching the key; matching means the two have a common prefix. For duplicated keys, the program overwrites the previous value. (There are also alternative solutions to handle duplicated keys, such as using a linked list, etc.) If there is no child matching the key, the program creates a new leaf and adds it to the children.

For Patricia T = (v, C), the function insert(T, k, v') inserts key k and value v' into the tree.

insert(T, k, v') = (v, ins(C, k, v'))    (5.12)

This function calls another internal function ins(C, k, v'). If the children C is empty, a new leaf is created. Otherwise, the children are examined one by one. Denote C = {(k1, T1), (k2, T2), ..., (kn, Tn)}, and let C' hold all the prefix-sub-tree pairs except for the first one.

ins(C, k, v') =
    {(k, (v', φ))}                  : C = φ
    {(k, (v', C_T1))} ∪ C'          : k1 = k
    {branch(k, v', k1, T1)} ∪ C'    : match(k1, k)
    {(k1, T1)} ∪ ins(C', k, v')     : otherwise    (5.13)

The first clause deals with the edge case of empty children: a leaf node containing v', bound to k, is returned as the only child. The second clause overwrites the previous value with v' if there is some child bound to the same key; note that C_T1 means the children of sub-tree T1. The third clause branches out a new leaf if the first child matches the key k. The last clause goes on checking the rest of the children.

We define two keys A and B as matching if they have a non-empty common prefix.

match(A, B) = A ≠ φ ∧ B ≠ φ ∧ a1 = b1    (5.14)

Where a1 and b1 are the first characters of A and B if they are not empty.
The function branch(k1, v, k2, T2) takes two keys, a value and a tree. Extract the longest common prefix k = lcp(k1, k2), and denote the different parts as k1' = k1 − k, k2' = k2 − k. The algorithm first handles the edge cases that either k1 is a prefix of k2, or k2 is a prefix of k1. For the former case, it creates a new node containing v, binds this node to k, and sets (k2', T2) as the only child; for the latter case, it recursively inserts k1' and v into T2. Otherwise, the algorithm creates a branch node, binds it to the longest common prefix k, and sets two children for it: one child is (k2', T2), the other is a leaf node containing v, bound to k1'.

branch(k1, v, k2, T2) =
    (k, (v, {(k2', T2)}))                  : k = k1
    (k, insert(T2, k1', v))                : k = k2
    (k, (φ, {(k1', (v, φ)), (k2', T2)}))   : otherwise    (5.15)

And the function lcp(A, B) keeps taking the same characters from A and B one by one. Denote a1 and b1 as the first characters of A and B if they are not empty, and A' and B' as the rest parts except for the first characters.

lcp(A, B) =
    φ                   : A = φ ∨ B = φ ∨ a1 ≠ b1
    {a1} ∪ lcp(A', B')  : a1 = b1    (5.16)

The following Haskell example program implements the Patricia insertion algorithm.

insert t k x = Patricia (value t) (ins (children t) k x) where
    ins [] k x = [(k, Patricia (Just x) [])]
    ins (p:ps) k x
        | (fst p) == k
            = (k, Patricia (Just x) (children (snd p))):ps -- overwrite
        | match (fst p) k
            = (branch k x (fst p) (snd p)):ps
        | otherwise
            = p:(ins ps k x)

match x y = x /= [] && y /= [] && head x == head y

branch k1 x k2 t2
    | k1 == k
        -- ex: insert "an" into "another"
        = (k, Patricia (Just x) [(k2', t2)])
    | k2 == k
        -- ex: insert "another" into "an"
        = (k, insert t2 k1' x)
    | otherwise
        = (k, Patricia Nothing [(k1', leaf x), (k2', t2)])
  where
      k = lcp k1 k2
      k1' = drop (length k) k1
      k2' = drop (length k) k2

lcp [] _ = []
lcp _ [] = []
lcp (x:xs) (y:ys) = if x == y then x:(lcp xs ys) else []

5.5.3 Look up

When looking up a key, we can't examine the characters one by one as in the trie. Starting from the root, we need to search among the children to see whether any one is bound to a prefix of the key. If there is such a child, we update the key by removing the prefix part, and recursively look up the updated key in this child. If there isn't any child bound to any prefix of the key, the look up fails.

1: function Look-Up(T, k)
2:   if T = NIL then
3:     return not found
4:   repeat
5:     match ← FALSE
6:     for ∀(ki, Ti) ∈ Children(T) do
7:       if k = ki then
8:         return Data(Ti)
9:       if ki is a prefix of k then
10:        match ← TRUE
11:        k ← k − ki
12:        T ← Ti
13:        break
14:  until ¬match
15:  return not found

The below Python example program implements the look up algorithm. It reuses the lcp(s1, s2) function defined previously to test whether one string is a prefix of the other.

def lookup(t, key):
    if t is None:
        return None
    while True:
        match = False
        for k, tr in t.children.items():
            if k == key:
                return tr.value
            (prefix, k1, k2) = lcp(key, k)
            if prefix != "" and k2 == "":
                match = True
                key = k1
                t = tr
                break
        if not match:
            return None
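Continuing the usage sketch from the insertion section, the look up behaves as expected on the same sample tree. Note that a bare branch label which was never inserted as a key carries no value.

t = None
for w in ['a', 'an', 'another', 'bool', 'boy', 'zoo']:
    t = insert(t, w, w)

assert lookup(t, 'another') == 'another'
# 'bo' labels an internal branch created while splitting 'bool'/'boy';
# the branch node stores no value, so the result is None
assert lookup(t, 'bo') is None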
This algorithm can also be realized recursively. For a Patricia in the form T = (v, C), it calls another function to find among the children C.

$lookup(T, k) = find(C, k)$   (5.17)

If C is empty, the lookup fails. Otherwise, for $C = \{(k_1, T_1), (k_2, T_2), ..., (k_n, T_n)\}$, we first examine whether k equals $k_1$ or $k_1$ is a prefix of k; if not, we recursively check the rest pairs, denoted as C'.

$find(C, k) = \begin{cases}
\phi & C = \phi \\
v_{T_1} & k = k_1 \\
lookup(T_1, k - k_1) & k_1 \sqsubset k \\
find(C', k) & \text{otherwise}
\end{cases}$   (5.18)

Where $A \sqsubset B$ means string A is a prefix of B. find mutually calls lookup when some child is bound to a string which is a prefix of the key. The Haskell example program below implements the lookup algorithm.

import qualified Data.List

find t k = find' (children t) k where
    find' [] _ = Nothing
    find' (p:ps) k
        | (fst p) == k = value (snd p)
        | (fst p) `Data.List.isPrefixOf` k = find (snd p) (diff (fst p) k)
        | otherwise = find' ps k
    diff k1 k2 = drop (length (lcp k1 k2)) k2
5.6 Trie and Patricia applications

Trie and Patricia can be used to solve some interesting problems. Integer-based prefix trees are used in compiler implementation, and many everyday software applications have features which can be realized with a trie or Patricia. In this section, some applications are given as examples, including an e-dictionary, word auto-completion, and the T9 input method. Commercial implementations typically do not adopt trie or Patricia directly; the solutions demonstrated here are for illustration purposes only.

5.6.1 E-dictionary and word auto-completion

Figure 5.12 shows a screenshot of an English-Chinese e-dictionary. In order to provide a good user experience, the dictionary searches its word library and lists all candidate words and phrases similar to what the user has entered.

Figure 5.12: E-dictionary. All candidates starting with what the user input are listed.

An e-dictionary typically contains hundreds of thousands of words, so it's very expensive to perform a whole-library search for every keystroke. Commercial software adopts complex approaches, including caching and indexing, to speed up this process.

Similar to the e-dictionary, figure 5.13 shows a popular Internet search engine. When the user inputs something, it provides a candidate list, with all items starting with what the user has entered. These candidates are shown in order of popularity: the more people search for a term, the higher its position in the list.

In both cases, the software provides a kind of word auto-completion mechanism. In some modern IDEs, the editor can even help users to auto-complete program code.

Let's see how to implement the e-dictionary with a trie or Patricia. To simplify the problem, we assume the dictionary only supports English to English translation.
Figure 5.13: A search engine. All candidates starting with what the user input are listed.

A dictionary stores key-value pairs, where the keys are English words or phrases and the values are their meanings, described in English sentences.

We could store all the words and their meanings in a trie, but it isn't space-effective, especially when there is a huge amount of items. We'll use a Patricia to realize the e-dictionary. When the user wants to look up the word 'a', the dictionary does not only return the meaning of 'a', but also provides a list of candidate words which all start with 'a': 'abandon', 'about', 'accent', 'adam', ... Of course, all these words are stored in the Patricia.

If there are too many candidates, one solution is to display only the top 10 words, letting the user browse for more.

The following algorithm reuses the lookup defined for Patricia. When it finds a node bound to a string which is a prefix of what we are looking for, it expands all its children until it gets n candidates.

1: function Look-Up(T, k, n)
2:   if T = NIL then
3:     return Φ
4:   prefix ← NIL
5:   repeat
6:     match ← FALSE
7:     for ∀(kᵢ, Tᵢ) ∈ Children(T) do
8:       if k is prefix of kᵢ then
9:         return Expand(Tᵢ, prefix, n)
10:      if kᵢ is prefix of k then
11:        match ← TRUE
12:        k ← k - kᵢ
13:        T ← Tᵢ
14:        prefix ← prefix + kᵢ
15:        break
16:  until ¬match
17:  return Φ
Where function Expand(T, prefix, n) picks n sub-trees which share the same prefix in T. It is realized as a BFS (breadth-first search) traversal. The chapter about searching explains BFS in detail.

1: function Expand(T, prefix, n)
2:   R ← Φ
3:   Q ← {(prefix, T)}
4:   while |R| < n ∧ |Q| > 0 do
5:     (k, T) ← Pop(Q)
6:     if Data(T) ≠ NIL then
7:       R ← R ∪ {(k, Data(T))}
8:     for ∀(kᵢ, Tᵢ) ∈ Children(T) do
9:       Push(Q, (k + kᵢ, Tᵢ))

The following example Python program implements the e-dictionary application. When testing whether a string is a prefix of another one, it uses the find function provided in the standard string library.

import string

def patricia_lookup(t, key, n):
    if t is None:
        return None
    prefix = ""
    while True:
        match = False
        for k, tr in t.children.items():
            if string.find(k, key) == 0: # key is a prefix of k
                return expand(prefix + k, tr, n)
            if string.find(key, k) == 0: # k is a prefix of key
                match = True
                key = key[len(k):]
                t = tr
                prefix += k
                break
        if not match:
            return None

def expand(prefix, t, n):
    res = []
    q = [(prefix, t)]
    while len(res) < n and len(q) > 0:
        (s, p) = q.pop(0)
        if p.value is not None:
            res.append((s, p.value))
        for k, tr in p.children.items():
            q.append((s + k, tr))
    return res
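To see the routine in action, here is a small hand-built Patricia (a sketch, not from the book: the Patricia class below is a stand-in for the one defined earlier in this chapter, and the tree is constructed directly instead of via insertion):

class Patricia:
    def __init__(self, value=None, children=None):
        self.value = value
        self.children = children if children is not None else {}

d = Patricia(None,
      {"a": Patricia("a letter",
         {"n": Patricia("an article",
            {"other": Patricia("adj. one more")}),
          "bout": Patricia("prep. concerning")})})

print(patricia_lookup(d, "an", 5))
# a possible output: [('an', 'an article'), ('another', 'adj. one more')]

Here 'an' matches a node exactly, while 'another' is produced by the BFS expansion of that node's sub-tree.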
This algorithm can also be implemented recursively: if the string we are looking for is empty, we expand all children until we get n candidates; otherwise we recursively examine the children to find one whose bound string relates to the search string as a prefix.

In programming environments supporting lazy evaluation, an intuitive solution is to expand all candidates, and take the first n on demand. Denote the Patricia prefix tree in the form T = (v, C); the function below enumerates all items starting with key k.

$findAll(T, k) = \begin{cases}
enum(C) & k = \phi, v = \phi \\
\{(\phi, v)\} \cup enum(C) & k = \phi, v \neq \phi \\
find(C, k) & k \neq \phi
\end{cases}$   (5.19)

The first two clauses deal with the edge cases where the key is empty: all the children are enumerated (nodes with empty values contribute no entries of their own), and the node's own value is included when it is not empty. The last clause finds the child matching k.

For non-empty children, $C = \{(k_1, T_1), (k_2, T_2), ..., (k_m, T_m)\}$, denote the rest pairs except for the first one as C'. The enumeration algorithm can be defined as below.

$enum(C) = \begin{cases}
\phi & C = \phi \\
mapAppend(k_1, findAll(T_1, \phi)) \cup enum(C') & \text{otherwise}
\end{cases}$   (5.20)

Where $mapAppend(k, L) = \{(k + k_i, v_i) | (k_i, v_i) \in L\}$. It concatenates the prefix k in front of every key-value pair in list L.

Function $find(C, k)$ is defined as follows. For empty children, the result is empty as well. Otherwise, it examines the first child $T_1$, which is bound to string $k_1$. If $k_1$ is equal to k, it calls mapAppend to add the prefix to the keys of all the children under $T_1$. If $k_1$ is a prefix of k, the algorithm recursively finds all children starting with $k - k_1$. On the other hand, if k is a prefix of $k_1$, all children under $T_1$ are valid results, prefixed with $k_1$. Otherwise, the algorithm bypasses the first child and goes on searching the rest of the children.

$find(C, k) = \begin{cases}
\phi & C = \phi \\
mapAppend(k, findAll(T_1, \phi)) & k_1 = k \\
mapAppend(k_1, findAll(T_1, k - k_1)) & k_1 \sqsubset k \\
mapAppend(k_1, findAll(T_1, \phi)) & k \sqsubset k_1 \\
find(C', k) & \text{otherwise}
\end{cases}$   (5.21)

The example Haskell program below implements the e-dictionary application according to the above equations.

findAll :: Patricia a -> Key -> [(Key, a)]
findAll t [] =
    case value t of
      Nothing -> enum $ children t
      Just x  -> ("", x):(enum $ children t)
    where
      enum [] = []
      enum (p:ps) = (mapAppend (fst p) (findAll (snd p) [])) ++ (enum ps)
findAll t k = find' (children t) k
    where
      find' [] _ = []
      find' (p:ps) k
          | (fst p) == k
              = mapAppend k (findAll (snd p) [])
          | (fst p) `Data.List.isPrefixOf` k
              = mapAppend (fst p) (findAll (snd p) (k `diff` (fst p)))
          | k `Data.List.isPrefixOf` (fst p)
              = mapAppend (fst p) (findAll (snd p) [])
          | otherwise = find' ps k
      diff x y = drop (length y) x

mapAppend s lst = map (\p -> (s ++ (fst p), snd p)) lst
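A direct Python transcription of equations (5.19)-(5.21) may also help to bridge the math and the code (a sketch, not the book's own listing; it assumes the same Patricia node layout used by the iterative program above):

def find_all(t, key):
    if t is None:
        return []
    if key == "":
        # equations (5.19)/(5.20): node's own value, plus all children
        res = [] if t.value is None else [("", t.value)]
        for k, tr in t.children.items():
            res += [(k + ki, v) for (ki, v) in find_all(tr, "")]
        return res
    for k, tr in t.children.items():
        if key == k or key.startswith(k):   # k1 = k, or k1 is a prefix of key
            return [(k + ki, v) for (ki, v) in find_all(tr, key[len(k):])]
        if k.startswith(key):               # key is a prefix of k1
            return [(k + ki, v) for (ki, v) in find_all(tr, "")]
    return []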
In a lazy evaluation environment, the top n candidates can be obtained with take(n, findAll(T, k)). Appendix A gives the detailed definition of the take function.

5.6.2 T9 input method

Most mobile phones around the year 2000 were equipped with a keypad. Users had a quite different experience from a PC when editing a short message or email, because the mobile-phone keypad, the so-called ITU-T keypad, has far fewer keys than a PC keyboard. Figure 5.14 shows one example.

Figure 5.14: an ITU-T keypad for mobile phone.

There are typically two methods to input English words or phrases with an ITU-T keypad. For instance, if the user wants to enter the word 'home', he can press the keys in the following sequence:

Press key '4' twice to enter the letter 'h';
Press key '6' three times to enter the letter 'o';
Press key '6' once to enter the letter 'm';
Press key '3' twice to enter the letter 'e'.

Another, much quicker way is to just press the following keys:

Press keys '4', '6', '6', '3'; the word 'home' appears on top of the candidate list;
Press key '*' to change to another candidate word, so the word 'good' appears;
Press key '*' again to change to the next candidate word, 'gone';
...

Comparing these two methods, we can see the second one is much easier for the end user. The only overhead is to store a dictionary of candidate words.

Method 2 is called the 'T9' input method, or predictive input method [6], [7]. The name 'T9' stands for 'text on 9 keys'. T9 input can be realized with a trie or Patricia.

In order to provide candidate words to the user, a dictionary must be prepared in advance, and a trie or Patricia can be used to store it. Commercial T9 implementations typically use complex indexing dictionaries in both the file system and the cache. The realization shown here is for illustration purposes only.
First, we define the T9 mapping, which maps each digit to its candidate characters.

$M_{T9} = \{2 \to abc,\ 3 \to def,\ 4 \to ghi,\ 5 \to jkl,\ 6 \to mno,\ 7 \to pqrs,\ 8 \to tuv,\ 9 \to wxyz\}$   (5.22)

With this mapping, $M_{T9}[i]$ returns the corresponding characters for digit i.

Suppose the user inputs digits $D = d_1 d_2 ... d_n$. If D isn't empty, denote the rest of the digits except for $d_1$ as D'. The below pseudo code shows how to realize T9 with a trie.

1: function Look-Up-T9(T, D)
2:   Q ← {(Φ, D, T)}
3:   R ← Φ
4:   while Q ≠ Φ do
5:     (prefix, D, T) ← Pop(Q)
6:     for each c in M_T9[d₁] do
7:       if c ∈ Children(T) then
8:         if D' = Φ then
9:           R ← R ∪ {prefix + c}
10:        else
11:          Push(Q, (prefix + c, D', Children(T)[c]))
12:  return R

Where prefix + c means appending character c to the end of string prefix. This algorithm performs a BFS search with a queue Q. The queue is initialized with a tuple (prefix, D, T) containing the empty prefix, the digit sequence to be searched, and the trie. It keeps picking tuples from the queue as long as the queue isn't empty, then gets the candidate characters for the first digit to be processed via the T9 map. For each such character c, if there is a sub-tree bound to it, we create a new tuple: the prefix is updated by appending c, the rest of the digits become the new D, and that sub-tree is taken. This new tuple is pushed back to the queue for further searching. If all the digits are processed, a candidate word is found, and we put this word into the result list R.

The following example program in Python implements this T9 search with a trie.

T9MAP = {'2':"abc", '3':"def", '4':"ghi", '5':"jkl", \
         '6':"mno", '7':"pqrs", '8':"tuv", '9':"wxyz"}
def trie_lookup_t9(t, key):
    if t is None or key == "":
        return None
    q = [("", key, t)]
    res = []
    while len(q) > 0:
        (prefix, k, t) = q.pop(0)
        i = k[0]
        if not i in T9MAP:
            return None # invalid input
        for c in T9MAP[i]:
            if c in t.children:
                if k[1:] == "":
                    res.append((prefix + c, t.children[c].value))
                else:
                    q.append((prefix + c, k[1:], t.children[c]))
    return res

Because a trie is not space-effective, we can modify the above algorithm into a Patricia solution. As long as the queue isn't empty, the algorithm pops a tuple. This time, we examine all the prefix/sub-tree pairs. For every pair $(k_i, T_i)$, we convert the alphabetic prefix $k_i$ back to a digit sequence D' by looking up the T9 map. If D' exactly matches the digits the user input, we have found a candidate word; otherwise, if the digit sequence is a prefix of what the user input, the program creates a new tuple, updates the prefix, the digits to be processed, and the sub-tree, then puts the tuple back into the queue for further search.

1: function Look-Up-T9(T, D)
2:   Q ← {(Φ, D, T)}
3:   R ← Φ
4:   while Q ≠ Φ do
5:     (prefix, D, T) ← Pop(Q)
6:     for each (kᵢ, Tᵢ) ∈ Children(T) do
7:       D' ← Convert-T9(kᵢ)
8:       if D' ⊑ D then   ▷ D' is a prefix of D
9:         if D' = D then
10:          R ← R ∪ {prefix + kᵢ}
11:        else
12:          Push(Q, (prefix + kᵢ, D - D', Tᵢ))
13:  return R

Function Convert-T9(K) converts each character in K back to a digit.

1: function Convert-T9(K)
2:   D ← Φ
3:   for each c ∈ K do
4:     for each (d → S) ∈ M_T9 do
5:       if c ∈ S then
6:         D ← D ∪ {d}
7:         break
8:   return D

The following example Python program implements the T9 input method with Patricia.
def patricia_lookup_t9(t, key):
    if t is None or key == "":
        return None
    q = [("", key, t)]
    res = []
    while len(q) > 0:
        (prefix, key, t) = q.pop(0)
        for k, tr in t.children.items():
            digits = toT9(k)
            if string.find(key, digits) == 0: # digits is a prefix of key
                if key == digits:
                    res.append((prefix + k, tr.value))
                else:
                    q.append((prefix + k, key[len(k):], tr))
    return res
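The program above relies on a toT9 helper, which the book only lists in Haskell further below; a direct Python transcription of Convert-T9 could look like this (a sketch under that assumption):

def toT9(word):
    # convert an alphabetic string back to its digit sequence
    digits = ""
    for c in word:
        for d, s in T9MAP.items():
            if c in s:
                digits += d
                break
    return digits

Reusing the little hand-built Patricia helper from the e-dictionary sketch, a quick test run could be:

d = Patricia(None,
      {"go": Patricia(None, {"od": Patricia("good"),
                             "ne": Patricia("gone")}),
       "home": Patricia("home")})

print(patricia_lookup_t9(d, "4663"))
# a possible output: [('home', 'home'), ('good', 'good'), ('gone', 'gone')]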
The T9 input method can also be realized recursively. Let's first define the trie solution. The algorithm takes two arguments: a trie storing all the candidate words, and the sequence of digits input by the user. If the sequence is empty, the result is empty as well; otherwise, it looks up C to find those children which are bound to the first digit $d_1$ according to the T9 map.

$findT9(T, D) = \begin{cases}
\{\phi\} & D = \phi \\
fold(f, \phi, lookupT9(d_1, C)) & \text{otherwise}
\end{cases}$   (5.23)

Where folding is defined in Appendix A. Function f takes two arguments: an intermediate list of candidates, which is initialized empty, and a pair (c, T'), where c is a candidate character to which sub-tree T' is bound. It appends character c to all the candidate words, and concatenates this to the result list.

$f(L, (c, T')) = mapAppend(c, findT9(T', D')) \cup L$   (5.24)

Note that this mapAppend function is a bit different from the one previously defined in the e-dictionary application: its first argument is a character, not a string.

Function $lookupT9(d, C)$ checks all the possible characters mapped to digit d. If a character is bound to some child in C, it is recorded as a candidate.

$lookupT9(d, C) = fold(g, \phi, M_{T9}[d])$   (5.25)

Where

$g(L, k) = \begin{cases}
L & find(C, k) = \phi \\
\{(k, T')\} \cup L & find(C, k) = T'
\end{cases}$   (5.26)

The below Haskell example program implements the T9 lookup algorithm with a trie.

mapT9 = [('2', "abc"), ('3', "def"), ('4', "ghi"), ('5', "jkl"),
         ('6', "mno"), ('7', "pqrs"), ('8', "tuv"), ('9', "wxyz")]

findT9 t [] = [("", value t)]
findT9 t (k:ks) = foldl f [] (lookupT9 k (children t))
    where
      f lst (c, tr) = (mapAppend' c (findT9 tr ks)) ++ lst

lookupT9 c children = case lookup c mapT9 of
    Nothing -> []
    Just s  -> foldl f [] s
  where
    f lst x = case lookup x children of
        Nothing -> lst
        Just t  -> (x, t):lst

mapAppend' x lst = map (\p -> (x:(fst p), snd p)) lst
Only a few modifications are needed when changing the realization from trie to Patricia. Firstly, each sub-tree is bound to a prefix string, not a single character.

$findT9(T, D) = \begin{cases}
\{\phi\} & D = \phi \\
fold(f, \phi, findPrefixT9(D, C)) & \text{otherwise}
\end{cases}$   (5.27)

The list for folding is given by calling the function $findPrefixT9(D, C)$. And f is also modified to reflect this change: it appends the candidate prefix D' in front of every result output by the recursive search, and then accumulates the words.

$f(L, (D', T')) = mapAppend(D', findT9(T', D - D')) \cup L$   (5.28)

Function $findPrefixT9(D, C)$ examines all the children. For every pair $(k_i, T_i)$, if converting $k_i$ back to digits yields a prefix of D, then this pair is selected as a candidate.

$findPrefixT9(D, C) = \{(k_i, T_i) | (k_i, T_i) \in C, convertT9(k_i) \sqsubset D\}$   (5.29)

Function $convertT9(K)$ converts every alphabetic character in K back to a digit according to the T9 map.

$convertT9(K) = \{d | \forall c \in K, \exists (d \to S) \in M_{T9} \Rightarrow c \in S\}$   (5.30)

The following example Haskell program implements the T9 input algorithm with Patricia.

findT9 t [] = [("", value t)]
findT9 t k = foldl f [] (findPrefixT9 k (children t))
    where
      f lst (s, tr) = (mapAppend s (findT9 tr (k `diff` s))) ++ lst
      diff x y = drop (length y) x

findPrefixT9 s lst = filter f lst
    where
      f (k, _) = (toT9 k) `Data.List.isPrefixOf` s

toT9 = map (\c -> head $ [d | (d, s) <- mapT9, c `elem` s])

Exercise 5.2

For the T9 input, compare the results of the algorithms realized with trie and Patricia: the sequences are different. Why does this happen? How can the algorithms be modified so that they output the candidates in the same sequence?
5.7 Summary

In this chapter, we started from the integer-based trie and Patricia. The map data structure based on integer Patricia plays an important role in compiler implementation. Alphabetic trie and Patricia are natural extensions and can be used to manipulate text. As examples, a predictive e-dictionary and the T9 input method were realized with trie and Patricia; although these examples differ from the real implementations in commercial software, they show simple approaches to solving such problems. Another important data structure, the suffix tree, is closely related to them. The suffix tree is introduced in the next chapter.
Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. Problem 12-1. ISBN: 0262032937. The MIT Press. 2001

[2] Chris Okasaki and Andrew Gill. Fast Mergeable Integer Maps. Workshop on ML, September 1998, pages 77-86. http://www.cse.ogi.edu/~andy/pub/finite.htm

[3] D.R. Morrison. PATRICIA -- Practical Algorithm To Retrieve Information Coded In Alphanumeric. Journal of the ACM, 15(4), October 1968, pages 514-534.

[4] Suffix Tree, Wikipedia. http://en.wikipedia.org/wiki/Suffix_tree

[5] Trie, Wikipedia. http://en.wikipedia.org/wiki/Trie

[6] T9 (predictive text), Wikipedia. http://en.wikipedia.org/wiki/T9_(predictive_text)

[7] Predictive text, Wikipedia. http://en.wikipedia.org/wiki/Predictive_text
Chapter 6 Suffix Tree

6.1 Introduction

The suffix tree is an important data structure. It can be used to realize many string operations particularly fast[3], and it is also widely used in bioinformatics, for instance in DNA pattern matching[4]. Weiner introduced the suffix tree in 1973[2]. The latest on-line construction algorithm was found in 1995[1].

The suffix tree for a string S is a special Patricia. Each edge is labeled with some sub-string of S, and each suffix of S corresponds to exactly one path from the root to a leaf.

Figure 6.1: The suffix tree for 'banana'

All suffixes, 'banana', 'anana', 'nana', 'ana', 'na', 'a', can be found in the above tree. Among them, the first 3 suffixes are explicitly shown; the others are implicitly represented. The reason why 'ana', 'na', 'a', and the empty suffix are not shown explicitly is that they are prefixes of the others. In order to show all suffixes explicitly, we can append a special pad terminal symbol which doesn't occur elsewhere in the string. Such a terminator is typically denoted as '$'. By this means, no suffix is a prefix of another.

Although the suffix tree for 'banana' is simple, the suffix tree for 'bananas', as shown in figure 6.2, is quite different.

To create the suffix tree for a given string, we can utilize the insertion algorithm for Patricia explained in the previous chapter.

1: function Suffix-Tree(S)
2:   T ← NIL
3:   for i ← 1 to |S| do
4:     T ← Patricia-Insert(T, Right(S, i))
5:   return T
Figure 6.2: The suffix tree for 'bananas'

For a non-empty string $S = s_1 s_2 ... s_n$ of length n = |S|, the function $Right(S, i) = s_i s_{i+1} ... s_n$ extracts the sub-string of S from the i-th character to the last one. This straightforward algorithm can also be defined as below.

$suffixTree(S) = fold(insertPatricia, \phi, suffixes(S))$   (6.1)

Where the function suffixes(S) gives all the suffixes of string S. If the string is empty, the result is one empty string; otherwise, S itself is one suffix, and the others can be given by recursively calling suffixes(S'), where S' is obtained by dropping the first character of S.

$suffixes(S) = \begin{cases}
\{\phi\} & S = \phi \\
\{S\} \cup suffixes(S') & \text{otherwise}
\end{cases}$   (6.2)

This solution constructs the suffix tree in $O(n^2)$ time for a string of length n: it inserts n suffixes in total, and each insertion takes time linear in the length of the suffix. This efficiency isn't good enough.
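A short Python sketch of this naive construction (an assumption: patricia_insert stands in for the Patricia insertion from the previous chapter, which is not repeated here):

def suffixes(s):
    # all suffixes of s, plus the empty string, as in equation (6.2)
    return [s[i:] for i in range(len(s))] + [""]

def naive_suffix_tree(s):
    t = None
    for suf in suffixes(s):
        t = patricia_insert(t, suf)  # hypothetical helper from chapter 5
    return t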
In this chapter, we first explain a fast on-line suffix trie construction solution using the suffix link concept. Because a trie isn't space efficient, we then introduce a linear-time on-line suffix tree construction algorithm found by Ukkonen, and show how to solve some interesting string manipulation problems with the suffix tree.

6.2 Suffix trie

Just like the relationship between trie and Patricia, the suffix trie has a much simpler structure than the suffix tree. Figure 6.3 shows the suffix trie for 'banana'.

Comparing with figure 6.1, we can see the difference between suffix tree and suffix trie: instead of representing a word, every edge in a suffix trie represents only one character. Thus a suffix trie needs much more space. If we pack all nodes which have only one child, the suffix trie turns into a suffix tree.

We can reuse the trie definition for the suffix trie: each node is bound to a character and contains multiple sub-trees as children, and a child can be referred to by its bound character.
Figure 6.3: Suffix trie for 'banana'

6.2.1 Node transfer and suffix link

For a string S of length n, define $S_i = s_1 s_2 ... s_i$; it is the prefix of S containing the first i characters.

In a suffix trie, each node represents a suffix string. For example, in figure 6.4, node X represents suffix 'a'; by adding character 'c', node X transfers to Y, which represents suffix 'ac'. We say node X transfers to Y with the edge of character 'c'[1].

Y ← Children(X)[c]

We also say that node X has a 'c'-child Y. The Python expression below reflects this concept.

y = x.children[c]

If node A in a suffix trie represents suffix $s_i s_{i+1} ... s_n$, and node B represents suffix $s_{i+1} s_{i+2} ... s_n$, we say node B represents the suffix of node A. We can create a link from A to B; this link is defined as the suffix link of node A[1]. Suffix links are drawn in dotted style. In figure 6.4, the suffix link of node A points to node B, and the suffix link of node B points to node C.

The suffix link is valid for all nodes except the root. We can add a suffix link field to the trie definition. The Python example code below shows this update.

class STrie:
    def __init__(self, suffix=None):
        self.children = {}
        self.suffix = suffix
Figure 6.4: Suffix trie for string "cacao". Node X represents 'a', node Y represents 'ac'; X transfers to Y with character 'c'.

Table 6.1: suffixes of Sᵢ

    s₁ s₂ s₃ ... sᵢ
    s₂ s₃ ... sᵢ
    ...
    sᵢ₋₁ sᵢ
    sᵢ
    "" (the empty suffix)

6.2.2 On-line construction

For a string S, suppose we have constructed the suffix trie for the i-th prefix $S_i = s_1 s_2 ... s_i$. Denote it as SuffixTrie(Sᵢ). Let's consider how to obtain SuffixTrie(Sᵢ₊₁) from SuffixTrie(Sᵢ).

If we list all suffixes corresponding to SuffixTrie(Sᵢ), from the longest (which is Sᵢ) to the shortest (which is empty), we get table 6.1. There are i + 1 suffixes in total.

One solution is to append the character $s_{i+1}$ to every suffix in this table, and then add another empty string. This idea can be realized by adding a new child for every node in the trie, binding all these new children to edges of character $s_{i+1}$.

However, some nodes in SuffixTrie(Sᵢ) may already have the $s_{i+1}$-child. For example, in figure 6.5, nodes X and Y correspond to suffixes 'cac' and 'ac' respectively. They don't have an 'a'-child, but node Z, which represents suffix 'c', already has its 'a'-child.
Algorithm 1 Update SuffixTrie(Sᵢ) to SuffixTrie(Sᵢ₊₁), initial version.
1: for ∀T ∈ SuffixTrie(Sᵢ) do
2:   Children(T)[sᵢ₊₁] ← Create-Empty-Node

Figure 6.5: Suffix tries of "cac" and "caca". (a) Suffix trie for string "cac". (b) Suffix trie for string "caca".

When appending $s_{i+1}$ to SuffixTrie(Sᵢ) -- in this example $s_{i+1}$ is the character 'a' -- we need to create new nodes for X and Y, but we needn't do this for Z.

If we check the nodes one by one according to table 6.1, we can stop immediately when we meet a node which already has the $s_{i+1}$-child. This is because if node X in SuffixTrie(Sᵢ) has the $s_{i+1}$-child, then, according to the definition of the suffix link, any suffix node X' of X in SuffixTrie(Sᵢ) must also have the $s_{i+1}$-child. In other words, let $c = s_{i+1}$; if wc is a sub-string of $S_i$, then every suffix of wc is also a sub-string of $S_i$ [1]. The only exception is the root, which represents the empty string Φ.

According to this fact, we can refine algorithm 1 to the following.

Algorithm 2 Update SuffixTrie(Sᵢ) to SuffixTrie(Sᵢ₊₁), second version.
1: for each T ∈ SuffixTrie(Sᵢ) in descending order of suffix length do
2:   if Children(T)[sᵢ₊₁] = NIL then
3:     Children(T)[sᵢ₊₁] ← Create-Empty-Node
4:   else
5:     break

The next question is how to iterate all nodes in descending order of suffix length. Define the top of a suffix trie as the deepest leaf node. This definition ensures that the top represents the longest suffix. Along the suffix link from the top to the next node, the length of the suffix decreases by one. Thus we can traverse the suffix trie from the top to the root by using the suffix links, and the order of such a traversal is exactly what we want. Finally, there is a special suffix trie for the empty string, SuffixTrie(NIL); we define the top to be equal to the root in this case.

function Insert(top, c)
  if top = NIL then   ▷ The trie is empty
    top ← Create-Empty-Node
  T ← top
  T' ← Create-Empty-Node   ▷ dummy init value
  while T ≠ NIL ∧ Children(T)[c] = NIL do
    Children(T)[c] ← Create-Empty-Node
    Suffix-Link(T') ← Children(T)[c]
    T' ← Children(T)[c]
    T ← Suffix-Link(T)
  if T ≠ NIL then
    Suffix-Link(T') ← Children(T)[c]
  return Children(top)[c]   ▷ returns the new top

Function Insert updates SuffixTrie(Sᵢ) to SuffixTrie(Sᵢ₊₁). It takes two arguments: the top of SuffixTrie(Sᵢ), and the character $s_{i+1}$. If the top is NIL, the trie is empty and there is no root yet, so the algorithm creates a root node in this case. A sentinel empty node T' is created; it keeps tracking the previously created new node. In the main loop, the algorithm checks the nodes one by one along the suffix links. If a node doesn't have the $s_{i+1}$-child, it creates a new node and binds the edge to character $s_{i+1}$. The algorithm repeatedly goes up along the suffix links until it either arrives at the root or finds a node which already has the $s_{i+1}$-child.
After the loop, if the node isn't empty, it means we stopped at a node which has the $s_{i+1}$-child, and the last suffix link is then pointed to that child. Finally, the new top position is returned, so that we can further insert other characters into the suffix trie.

For a given string S, the suffix trie can be built by repeatedly calling the Insert function.

1: function Suffix-Trie(S)
2:   t ← NIL
3:   for i ← 1 to |S| do
4:     t ← Insert(t, sᵢ)
5:   return t

This algorithm returns the top of the suffix trie, not the root. In order to access the root, we can traverse along the suffix links.

1: function Root(T)
2:   while Suffix-Link(T) ≠ NIL do
3:     T ← Suffix-Link(T)
4:   return T

Figure 6.6 shows the steps of constructing the suffix trie for "cacao". Only the last layer of suffix links is shown.

For the Insert algorithm, the computation time is proportional to the size of the suffix trie. In the worst case, the suffix trie is built in $O(n^2)$ time, where n = |S|. One example is $S = a^n b^n$: n characters 'a' followed by n characters 'b'.

The following example Python program implements the suffix trie construction algorithm.

def suffix_trie(str):
    t = None
    for c in str:
        t = insert(t, c)
    return root(t)

def insert(top, c):
    if top is None:
        top = STrie()
    node = top
    new_node = STrie() # dummy init value
    while (node is not None) and (c not in node.children):
        # the new child's suffix link temporarily points to its parent;
        # this temporary value is already correct for a child of the root
        new_node.suffix = node.children[c] = STrie(node)
        new_node = node.children[c]
        node = node.suffix
    if node is not None:
        new_node.suffix = node.children[c]
    return top.children[c] # update top

def root(node):
    while node.suffix is not None:
        node = node.suffix
    return node

Figure 6.6: Construction of the suffix trie for "cacao" in 6 steps: (a) empty, (b) "c", (c) "ca", (d) "cac", (e) "caca", (f) "cacao". Only the last layer of suffix links is shown in dotted arrows.
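As a quick sanity check of this construction (a sketch; the printout depends on the STrie definition above):

t = suffix_trie("cacao")
print(sorted(t.children.keys()))
# expected: ['a', 'c', 'o'] -- every suffix of "cacao" starts
# with one of these three characters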
6.3 Suffix Tree

A suffix trie isn't space efficient, and the construction takes quadratic time. If we don't care about speed, we can compress the suffix trie into a suffix tree[6]. Ukkonen found a linear-time on-line suffix tree construction algorithm in 1995.

6.3.1 On-line construction

Active point and end point

The suffix trie construction algorithm reveals an important fact about what happens when SuffixTrie(Sᵢ) is updated to SuffixTrie(Sᵢ₊₁). Reviewing the last two steps in figure 6.6, there are two different kinds of updates:

1. All leaves are appended with a new node for $s_{i+1}$;
2. Some non-leaf nodes are branched out with a new node for $s_{i+1}$.

The first type of update is trivial, because for every newly arriving character we need to do this work anyway. Ukkonen defines such a leaf as an 'open' node.

The second type of update is important. We need to figure out which internal nodes need to branch out, so we focus only on these nodes and apply the update.

Ukkonen defines the path along the suffix links from the top to the end as the 'boundary path'. Denote the nodes on the boundary path as $n_1, n_2, ..., n_j, ..., n_k$. These nodes start from the leaf node (the first one is the top position); suppose that after the j-th node they are no longer leaves, so we need to repeatedly branch out from this point up to the k-th node. Ukkonen defines the first non-leaf node $n_j$ as the 'active point' and the last node $n_k$ as the 'end point'. The end point can be the root.

Reference pair

Figure 6.7: Suffix tree of "bananas". X transfers to Y with sub-string "na".

Figure 6.7 shows the suffix tree of the English word "bananas". Node X represents the suffix 'a'. By adding the sub-string 'na', node X transfers to node Y, which represents the suffix 'ana'. In other words, we can represent Y with a pair of a node and a sub-string, like (X, w), where w = 'na'. Ukkonen defines such a pair as a reference pair. Not only explicit nodes, but also implicit positions in the suffix tree can be represented with reference pairs. For example, (X, 'n') represents a position which is not an explicit node. By using reference pairs, we can represent every position in a suffix tree.

In order to save space, for string S, all sub-strings can be represented as a pair of indices (l, r), where l is the left index and r is the right index of the characters of the sub-string. For instance, if S = "bananas" and the index starts from 1, the sub-string 'na' can be represented by the pair (3, 4). As a result, there is only one copy of the complete string, and every position in the suffix tree is defined as (node, (l, r)). This is the final form of the reference pair.

With reference pairs, node transfer for the suffix tree can be defined as follows.

Children(X)[sₗ] ← ((l, r), Y) ⟺ Y = (X, (l, r))

If the character $s_l = c$, we say that node X has a c-child, and this child is Y. Y can be transferred from X with the sub-string (l, r). Each node can have at most one c-child.

Canonical reference pair

Obviously, one position in a suffix tree may have multiple reference pairs. For example, node Y in figure 6.7 can be denoted either as (X, (3, 4)) or as (root, (2, 4)). If we define the empty string as Φ = (i, i - 1), Y can also be represented as (Y, Φ).

The canonical reference pair is the one whose node is closest to the position. In particular, if the position is an explicit node, the canonical reference pair is (node, Φ); so (Y, Φ) is the canonical reference pair of node Y.

The algorithm below converts a reference pair (node, (l, r)) to the canonical reference pair (node', (l', r)). Note that since r doesn't change, the algorithm only returns (node', l') as the result.

Algorithm 3 Convert a reference pair to the canonical reference pair
1: function Canonize(node, (l, r))
2:   if node = NIL then
3:     if (l, r) = Φ then
4:       return (NIL, l)
5:     else
6:       return Canonize(root, (l + 1, r))
7:   while l ≤ r do   ▷ (l, r) isn't empty
8:     ((l', r'), node') ← Children(node)[sₗ]
9:     if r - l ≥ r' - l' then
10:      l ← l + r' - l' + 1   ▷ remove |(l', r')| characters from (l, r)
11:      node ← node'
12:    else
13:      break
14:  return (node, l)
If the passed-in node parameter is NIL, it signals a very special case: the function was called like

Canonize(Suffix-Link(root), (l, r))

Because the suffix link of the root points to NIL, the result should be (root, (l + 1, r)) if (l, r) is not Φ. Otherwise, (NIL, Φ) is returned to indicate a terminal position. We explain this special case in detail in later sections.

The algorithm

In section 6.3.1 we mentioned that all updating of leaves is trivial, because we only need to append the new character to the leaf. With reference pairs this means that when updating SuffixTree(Sᵢ) to SuffixTree(Sᵢ₊₁), all reference pairs of the form (node, (l, i)) are leaves; they will change to (node, (l, i + 1)) next time. Ukkonen defines a leaf in the form (node, (l, ∞)), where ∞ means 'open to grow'. We can skip all leaves until the suffix tree is completely constructed; after that, we change all ∞ values to the length of the string. So the main algorithm only cares about positions from the active point to the end point.

However, how do we find the active point and the end point? When the suffix tree construction starts, there is only a root node; there are no branches or leaves. The active point should be (root, Φ), or (root, (1, 0)) (the string index starts from 1).

The end point is a position where we can finish updating SuffixTree(Sᵢ). According to the suffix trie algorithm, it should be a position which already has the $s_{i+1}$-child. Because a position in a suffix trie may not be an explicit node in the suffix tree, if (node, (l, r)) is the end point, there are two cases:

1. (l, r) = Φ. The node itself is the end point, and it has the $s_{i+1}$-child: Children(node)[sᵢ₊₁] ≠ NIL.
2. Otherwise, l ≤ r, and the end point is an implicit position. It must satisfy $s_{i+1} = s_{l' + |(l, r)|}$, where Children(node)[sₗ] = ((l', r'), node') and |(l, r)| is the length of the sub-string (l, r), which equals r - l + 1. This is illustrated in figure 6.8. We can also say that (node, (l, r)) has the $s_{i+1}$-child implicitly.

Figure 6.8: Implicit end point
Ukkonen found a very important fact: if (node, (l, i)) is the end point of SuffixTree(Sᵢ), then (node, (l, i + 1)) is the active point of SuffixTree(Sᵢ₊₁).

This is because if (node, (l, i)) is the end point of SuffixTree(Sᵢ), it must have an $s_{i+1}$-child (either explicitly or implicitly). If this end point represents the suffix $s_k s_{k+1} ... s_i$, it is the longest suffix in SuffixTree(Sᵢ) such that $s_k s_{k+1} ... s_i s_{i+1}$ is a sub-string of $S_i$. Considering $S_{i+1}$, $s_k s_{k+1} ... s_i s_{i+1}$ must occur at least twice in $S_{i+1}$, so the position (node, (l, i + 1)) is the active point of SuffixTree(Sᵢ₊₁). Figure 6.9 illustrates this fact.

Figure 6.9: End point in SuffixTree(Sᵢ) and active point in SuffixTree(Sᵢ₊₁).

Summarizing the above facts, Ukkonen's on-line construction algorithm can be given as follows.

1: function Update(node, (l, i))
2:   prev ← Create-Empty-Node   ▷ initialized as sentinel
3:   loop   ▷ traverse along the suffix links
4:     (finish, node') ← End-Point-Branch?(node, (l, i - 1), sᵢ)
5:     if finish then
6:       break
7:     Children(node')[sᵢ] ← ((i, ∞), Create-Empty-Node)
8:     Suffix-Link(prev) ← node'
9:     prev ← node'
10:    (node, l) ← Canonize(Suffix-Link(node), (l, i - 1))
11:  Suffix-Link(prev) ← node
12:  return (node, l)   ▷ the end point

This algorithm takes a reference pair (node, (l, i)) as its argument; note that the position (node, (l, i - 1)) is the active point of SuffixTree(Sᵢ₋₁). Then we start a loop which goes along the suffix links until the current position (node, (l, i - 1)) is the end point. Otherwise, the function End-Point-Branch?
returns a position from which the new leaf branches out.

The End-Point-Branch? algorithm is realized as below.

function End-Point-Branch?(node, (l, r), c)
  if (l, r) = Φ then
    if node = NIL then
      return (TRUE, root)
    else
      return (Children(node)[c] ≠ NIL, node)
  else
    ((l', r'), node') ← Children(node)[sₗ]
    pos ← l' + |(l, r)|
    if s_pos = c then
      return (TRUE, node)
    else
      p ← Create-Empty-Node
      Children(node)[sₗ'] ← ((l', pos - 1), p)
      Children(p)[s_pos] ← ((pos, r'), node')
      return (FALSE, p)

If the position is (root, Φ), we have arrived at the root. It's definitely the end point, so we can finish this round of updating. If the position is of the form (node, Φ), the reference pair represents an explicit node; we examine whether this node already has the c-child, where $c = s_i$. If not, we need to branch out a leaf.

Otherwise, the position (node, (l, r)) points to an implicit node. We need to find the exact position next to it to see if there is a c-child. If yes, we have met an end point and the updating loop can be finished; otherwise, we turn the position into an explicit node and return it for further branching.

We can finalize Ukkonen's algorithm as below.

1: function Suffix-Tree(S)
2:   root ← Create-Empty-Node
3:   node ← root; l ← 0
4:   for i ← 1 to |S| do
5:     (node, l) ← Update(node, (l, i))
6:     (node, l) ← Canonize(node, (l, i))
7:   return root

Figure 6.10 shows the steps of constructing the suffix tree for the string "cacao". Note that we needn't set suffix links for leaf nodes; only branch nodes need suffix links.

The following example Python program implements Ukkonen's algorithm. First is the node definition.

class Node:
    def __init__(self, suffix=None):
        self.children = {} # 'c':(word, Node), where word = (l, r)
        self.suffix = suffix

Because there is only one copy of the complete string, all sub-strings are represented as (left, right) pairs, and leaves are open pairs (left, ∞). The suffix tree is defined as below.
Figure 6.10: Construction of the suffix tree for "cacao" in 6 steps: (a) empty, (b) "c", (c) "ca", (d) "cac", (e) "caca", (f) "cacao". Only the last layer of suffix links is shown in dotted arrows.

class STree:
    def __init__(self, s):
        self.str = s
        self.infinity = len(s) + 1000
        self.root = Node()

The infinity value is defined as the length of the string plus a big number. Some auxiliary functions are defined for convenience.

def substr(str, str_ref):
    (l, r) = str_ref
    return str[l:r+1]

def length(str_ref):
    (l, r) = str_ref
    return r - l + 1

The main entry of Ukkonen's algorithm is implemented as follows.

def suffix_tree(str):
    t = STree(str)
    node = t.root # initial active point is (root, Empty)
    l = 0
    for i in range(len(str)):
        (node, l) = update(t, node, (l, i))
        (node, l) = canonize(t, node, (l, i))
    return t
def update(t, node, str_ref):
    (l, i) = str_ref
    c = t.str[i] # current char
    prev = Node() # dummy init
    while True:
        (finish, p) = branch(t, node, (l, i - 1), c)
        if finish:
            break
        p.children[c] = ((i, t.infinity), Node())
        prev.suffix = p
        prev = p
        (node, l) = canonize(t, node.suffix, (l, i - 1))
    prev.suffix = node
    return (node, l)

def branch(t, node, str_ref, c):
    (l, r) = str_ref
    if length(str_ref) <= 0: # (node, empty)
        if node is None: # terminal position
            return (True, t.root)
        else:
            return ((c in node.children), node)
    else:
        ((l1, r1), node1) = node.children[t.str[l]]
        pos = l1 + length(str_ref)
        if t.str[pos] == c:
            return (True, node)
        else:
            branch_node = Node()
            node.children[t.str[l1]] = ((l1, pos - 1), branch_node)
            branch_node.children[t.str[pos]] = ((pos, r1), node1)
            return (False, branch_node)

def canonize(t, node, str_ref):
    (l, r) = str_ref
    if node is None:
        if length(str_ref) <= 0:
            return (None, l)
        else:
            return canonize(t, t.root, (l + 1, r))
    while l <= r: # str_ref is not empty
        ((l1, r1), child) = node.children[t.str[l]]
        if r - l >= r1 - l1:
            l += r1 - l1 + 1
            node = child
        else:
            break
    return (node, l)

Functional suffix tree construction

Giegerich and Kurtz found that Ukkonen's algorithm can be transformed to McCreight's algorithm[7]. The three suffix tree construction algorithms found by Weiner, McCreight, and Ukkonen are all bound to O(n) time. Giegerich and Kurtz conjectured that any sequential suffix tree construction method not based on suffix links, active suffixes, etc., fails to meet the O(n) criterion.

There is an implementation in PLT/Scheme[10] based on Ukkonen's algorithm; however, it updates suffix links during processing, which is not purely functional.

A lazy suffix tree construction method is discussed in [8], and this method was contributed to Haskell Hackage by Bryan O'Sullivan[9]. The method depends on the lazy evaluation property: the tree won't be constructed until it is traversed. However, it can't guarantee O(n) performance if the programming environment or language doesn't support lazy evaluation.

The following Haskell program defines the suffix tree. A suffix tree is either a leaf, or a branch containing multiple sub-trees, each of which is bound to a string.

data Tr = Lf | Br [(String, Tr)] deriving (Eq)
type EdgeFunc = [String] -> (String, [String])

The edge function extracts a common prefix from a list of strings. The prefix returned by the edge function may not be the longest one; the empty string is also allowed. The exact behavior can be customized with different edge functions, giving a generic radix tree building function:

build(edge, X)

It takes an edge function and a set of strings. X can be all the suffixes of a string, so that we get a suffix trie or a suffix tree. We'll also explain later that X can be all the prefixes, which leads to a normal prefix trie or Patricia.

Suppose all the strings are built from a character set Σ. When building the tree, if X only contains one empty string, the result tree is an empty leaf. Otherwise, we examine every character in Σ, group the strings in X by their initial characters, and apply the edge function to these groups.

$build(edge, X) = \begin{cases}
leaf & X = \{\phi\} \\
branch(\{(\{c\} \cup p, build(edge, X')) |\ c \in \Sigma, G \in \{group(X, c)\}, (p, X') \in \{edge(G)\}\}) & \text{otherwise}
\end{cases}$   (6.3)

The algorithm categorizes all suffixes into groups by their first letters, and removes the first letter of each element in every group. For example, the suffixes {"acac", "cac", "ac", "c"} are categorized into the groups {('a', ["cac", "c"]), ('c', ["ac", ""])}.

$group(X, c) = \{C' | \{c_1\} \cup C' \in X, c_1 = c\}$   (6.4)

Function group enumerates all suffixes in X; for each one, denote the first character as $c_1$ and the rest of the characters as C'. If $c_1$ is the same as the given character c, then C' is collected. The example Haskell program below implements the generic radix tree building algorithm.
alpha = ['a'..'z'] ++ ['A'..'Z']

lazyTree :: EdgeFunc -> [String] -> Tr
lazyTree edge = build where
    build [[]] = Lf
    build ss = Br [(a:prefix, build ss') |
                       a <- alpha,
                       xs@(x:_) <- [[cs | c:cs <- ss, c == a]],
                       (prefix, ss') <- [edge xs]]

Different edge functions produce different radix trees. Since the edge function extracts a common prefix from a set of strings, the simplest one constantly uses the empty string as the common prefix. This edge function builds a trie.

$edgeTrie(X) = (\phi, X)$   (6.5)

We can also realize an edge function that extracts the longest common prefix; such an edge function builds a Patricia. Denote the strings as $X = \{x_1, x_2, ..., x_n\}$; for each string $x_i$, let the initial character be $c_i$, and the rest of the characters in $x_i$ be $W_i$. If there is only one string in X, the longest common prefix is definitely that string. If two strings start with different initial characters, the longest common prefix is empty. Otherwise, all the strings share the same initial character, which definitely belongs to the longest common prefix; we can remove it from all strings and recursively call the edge function.

$edgeTree(X) = \begin{cases}
(x_1, \{\phi\}) & X = \{x_1\} \\
(\phi, X) & |X| > 1, \exists x_i \in X, c_i \neq c_1 \\
(\{c_1\} \cup p, Y) & (p, Y) = edgeTree(\{W_i | x_i \in X\})
\end{cases}$   (6.6)

Here are some examples of the edgeTree function.

edgeTree({"an", "another", "and"}) = ("an", {"", "other", "d"})
edgeTree({"bool", "foo", "bar"}) = ("", {"bool", "foo", "bar"})

The following example Haskell program implements this edge function.

edgeTree :: EdgeFunc
edgeTree [s] = (s, [[]])
edgeTree awss@((a:w):ss)
    | null [c | c:_ <- ss, a /= c] = (a:prefix, ss')
    | otherwise = ("", awss)
  where
    (prefix, ss') = edgeTree (w:[u | _:u <- ss])
edgeTree ss = ("", ss)

For any given string, we can build a suffix trie and a suffix tree by feeding its suffixes to these two edge functions.

$suffixTrie(S) = build(edgeTrie, suffixes(S))$   (6.7)

$suffixTree(S) = build(edgeTree, suffixes(S))$   (6.8)

Because build(edge, X) is generic, it can also be used to build other radix trees, such as the normal prefix trie and Patricia.

$trie(S) = build(edgeTrie, prefixes(S))$   (6.9)
  • 660. x of si. If yes, the number of sub-trees in Ti is returned as the result. There is a special case that Ti is a leaf without any children. We need return 1 but not zero. This is why we use the maximum function. Otherwise, if si is pre
  • 661. x of s, then we remove si part from s, and recursively look up in Ti. The following Python program implements this algorithm.
  • 662. 6.4. SUFFIX TREE APPLICATIONS 157 def lookup_pattern(t, s): node = t.root while True: match = False for _, (str_ref, tr) in node.children.items(): edge = substr(t, str_ref) if string.find(edge, s)==0: #s `isPrefixOf` edge return max(len(tr.children), 1) elif string.find(s, edge)==0: #edge `isPrefixOf` s match = True node = tr s = s[len(edge):] break if not match: return 0 return 0 # not found This algorithm can also be realized in recursive way. For the non-leaf sux tree T, denote the children as C = f(s1; T1); (s2; T2); :::g. We search the sub string among the children. lookuppattern(T; s) = find(C; s) (6.11) If children C is empty, it means the sub string doesn't occurs at all. Oth-erwise, we examine the
  • 663. rst pair (s1; T1), if s is pre
  • 664. x of s1, then the number of sub-trees in T1 is the result. If s1 is pre
  • 665. x of s, we remove s1 from s, and recursively look up it in T1; otherwise, we go on to examine the rest children denoted as C0. find(C; s) = 8 : 0 : C = max(1; jC1j) : s @ s1 lookuppattern(T1; s s1) : s1 @ s find(C0; s) : otherwise (6.12) The following Haskell example code implements this algorithm. lookupPattern (Br lst) ptn = find lst where find [] = 0 find ((s, t):xs) j ptn `isPrefixOf` s = numberOfBranch t j s `isPrefixOf` ptn = lookupPattern t (drop (length s) ptn) j otherwise = find xs numberOfBranch (Br ys) = length ys numberOfBranch _ = 1 findPattern s ptn = lookupPattern (suffixTree $ s++$) ptn We always append special terminator to the string (the `$' in above pro-gram), so that there won't be any sux becomes the pre
  • 666. x of the other[3]. Sux tree also supports searching pattern like a**n, we skip it here. Read-ers can refer to [13] and [14] for details.
  • 667. 158 CHAPTER 6. SUFFIX TREE 6.4.2 Find the longest repeated sub-string After adding a special terminator character to string S, The longest repeated sub-string can be found by searching the deepest branches in sux tree. Consider the example sux tree shown in
  • 668. gure 6.11 $ i mississippi$ p s $ ppi$ ssi A ppi$ ssippi$ i$ pi$ i B C si ppi$ ssippi$ ppi$ ssippi$ Figure 6.11: The sux tree for `mississippi$' There are three branch nodes, A, B, and C with depth 3. However, A represents the longest repeated sub-string issi. B and C represent for si, ssi, they are shorter than A. This example tells us that the depth of the branch node should be mea-sured by the number of characters traversed from the root. But not the number of explicit branch nodes. To
  • 669. nd the longest repeated sub-string, we can perform BFS in the sux tree. 1: function Longest-Repeated-Substring(T) 2: Q (NIL, Root(T)) 3: R NIL 4: while Q is not empty do 5: (s; T) Pop(Q) 6: for each ((l; r); T0) 2 Children(T) do 7: if T0 is not leaf then 8: s0 Concatenate(s; (l; r)) 9: Push(Q; (s0; T0)) 10: R Update(R; s0) 11: return R This algorithm initializes a queue with a pair of an empty string and the root. Then it repeatedly examine the candidate in the queue. For each node, the algorithm examines each children one by one. If it is a branch node, the child is pushed back to the queue for further search. And the sub-string represented by this child will be treated as a candidate of the longest repeated sub-string. Function Update(R; s0) updates the longest repeated sub-string candidates. If multiple candidates have the same length, they are all kept in a result list.
  • 670. 6.4. SUFFIX TREE APPLICATIONS 159 1: function Update(L; s) 2: if L = NIL _jl1j jsj then 3: return l fsg 4: if jl1j = jsj then 5: return Append(L; s) 6: return L The above algorithm can be implemented in Python as the following example program. def lrs(t): queue = [(, t.root)] res = [] while len(queue)0: (s, node) = queue.pop(0) for _, (str_ref, tr) in node.children.items(): if len(tr.children)0: s1 = s+t.substr(str_ref) queue.append((s1, tr)) res = update_max(res, s1) return res def update_max(lst, x): if lst ==[] or len(lst[0]) len(x): return [x] if len(lst[0]) == len(x): return lst + [x] return lst Searching the deepest branch can also be realized recursively. If the tree is just a leaf node, empty string is returned, else the algorithm tries to
  • 671. nd the longest repeated sub-string from the children. LRS(T) = : leaf(T) longest(fsi [ LRS(Ti)j(si; Ti) 2 C; :leaf(Ti)g) : otherwise (6.13) The following Haskell example program implements the longest repeated sub-string algorithm. isLeaf Lf = True isLeaf _ = False lrs'::Tr!String lrs' Lf = lrs' (Br lst) = find $ filter (not isLeaf snd) lst where find [] = find ((s, t):xs) = maximumBy (compare `on` length) [s++(lrs' t), find xs] 6.4.3 Find the longest common sub-string The longest common sub-string, can also be quickly found with sux tree. The solution is to build a generalized sux tree. If the two strings are denoted as txt1 and txt2, a generalized sux tree is SuffixT ree(txt1$1txt2$2). Where
  • 672. 160 CHAPTER 6. SUFFIX TREE $1 is a special terminator character for txt1, and $26= $1 is another special terminator character for txt2. The longest common sub-string is indicated by the deepest branch node, with two forks corresponding to both ...$1... and ...$2(no $1). The de
  • 673. nition of the deepest node is as same as the one for the longest repeated sub-string, it is the number of characters traversed from root. If a node has ...$1... under it, the node must represent a sub-string of txt1, as $1 is the terminator of txt1. On the other hand, since it also has ...$2 (without $1), this node must represent a sub-string of txt2 too. Because it's the deepest one satis
  • 674. ed this criteria, so the node represents the longest common sub-string. Again, we can use BFS (bread
  • 676. nd the longest common sub-string. 1: function Longest-Common-Substring(T) 2: Q (NIL, Root(T)) 3: R NIL 4: while Q is not empty do 5: (s; T) POP(Q) 6: if Match-Fork(T) then 7: R Update(R; s) 8: for each ((l; r); T0) 2 Children(T) do 9: if T0 is not leaf then 10: s0 Concatenate(s; (l; r)) 11: Push(Q; (s0; T0)) 12: return R Most part is as same as the the longest repeated sub-sting searching al-gorithm. The function Match-Fork checks if the children satisfy the common sub-string criteria. 1: function Match-Fork(T) 2: if j Children(T) j = 2 then 3: f(s1; T1); (s2; T2)g Children(T) 4: return T1 is leaf ^T2 is leaf ^ Xor($1 2 s1; $1 2 s2)) 5: return FALSE In this function, it checks if the two children are both leaf. One contains $2, while the other doesn't. This is because if one child is a leaf, it always contains $1 according to the de
  • 677. nition of sux tree. The following Python program implement the longest common sub-string program. def lcs(t): queue = [(, t.root)] res = [] while len(queue)0: (s, node) = queue.pop(0) if match_fork(t, node): res = update_max(res, s) for _, (str_ref, tr) in node.children.items(): if len(tr.children)0: s1 = s + t.substr(str_ref) queue.append((s1, tr))
  • 678. 6.4. SUFFIX TREE APPLICATIONS 161 return res def is_leaf(node): return node.children=={} def match_fork(t, node): if len(node.children)==2: [(_, (str_ref1, tr1)), (_, (str_ref2, tr2))]=node.children.items() return is_leaf(tr1) and is_leaf(tr2) and (t.substr(str_ref1).find('#')!=-1) != (t.substr(str_ref2).find('#')!=-1) return False The longest common sub-string
  • 679. nding algorithm can also be realized recur-sively. If the sux tree T is a leaf, the result is empty; Otherwise, we examine all children in T. For those satisfy the matching criteria, the sub-string are collected as candidates; for those don't matching, we recursively search the common sub-string among the children. The longest candidate is selected as the
  • 680. nal result. LCS(T) = 8 : : leaf(T) longest( fsij(si; Ti) 2 C; match(Ti)g[ fsi [ LCS(Ti)j(si; Ti) 2 C;:match(Ti)g) : otherwise (6.14) The following Haskell example program implements the longest common sub-string algorithm. lcs Lf = [] lcs (Br lst) = find $ filter (not isLeaf snd) lst where find [] = [] find ((s, t):xs) = maxBy (compare `on` length) (if match t then s:(find xs) else (map (s++) (lcs t)) ++ (find xs)) match (Br [(s1, Lf), (s2, Lf)]) = (# `isInfixOf` s1) == (# `isInfixOf` s2) match _ = False 6.4.4 Find the longest palindrome A palindrome is a string, S, such that S = reverse(S) For example, level, rotator, civic are all palindrome. The longest palindrome in a string s1s2:::sn can be found in O(n) time with sux tree. The solution can be bene
  • 681. t from the longest common sub-string algorithm. For string S, if sub-string w is a palindrome, then it must be sub-string of reverse(S) too. for instance, issi is a palindrome, it is a sub-string of mississippi. When reverse to ippississim, issi is also a sub-string. Based on this fact, we can
  • 682. nd the longest palindrome by searching the longest common sub-string for S and reverse(S). palindromem(S) = LCS(suffixT ree(S [ reverse(S))) (6.15)
  • 683. 162 CHAPTER 6. SUFFIX TREE The following Haskell example program
  • 684. nds the longest palindrome. longestPalindromes s = lcs $ suffixTree (s++#++(reverse s)++$) 6.4.5 Others Sux tree can also be used for data compression, such as Burrows-Wheeler transform, LZW compression (LZSS) etc. [3] 6.5 Notes and short summary Sux Tree was
  • 685. rst introduced by Weiner in 1973 [?]. In 1976, McCreight greatly simpli
  • 686. ed the construction algorithm. McCreight constructs the sux tree from right to left. In 1995, Ukkonen gave the
  • 687. rst on-line construction algorithms from left to right. All the three algorithms are linear time (O(n)). And some research shows the relationship among these 3 algorithms. [7]
  • 688. Bibliography [1] Esko Ukkonen. On-line construction of sux trees. Al-gorithmica 14 (3): 249{260. doi:10.1007/BF01206331. http://www.cs.helsinki.
  • 689. /u/ukkonen/SuxT1withFigs.pdf [2] Weiner, P. Linear pattern matching algorithms, 14th Annual IEEE Symposium on Switching and Automata Theory, pp. 1C11, doi:10.1109/SWAT.1973.13 [3] Sux Tree, Wikipedia. http://en.wikipedia.org/wiki/Sux tree [4] Esko Ukkonen. Sux tree and sux array techniques for pattern analysis in strings. http://www.cs.helsinki.
  • 690. /u/ukkonen/Erice2005.ppt [5] Trie, Wikipedia. http://en.wikipedia.org/wiki/Trie [6] Sux Tree (Java). http://en.literateprograms.org/Sux tree (Java) [7] Robert Giegerich and Stefan Kurtz. From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Sux Tree Con-struction. Science of Computer Programming 25(2-3):187-218, 1995. http://citeseer.ist.psu.edu/giegerich95comparison.html [8] Robert Giegerich and Stefan Kurtz. A Comparison of Imper-ative and Purely Functional Sux Tree Constructions. Algo-rithmica 19 (3): 331{353. doi:10.1007/PL00009177. www.zbh.uni-hamburg. de/pubs/pdf/GieKur1997.pdf [9] Bryan O'Sullivan. suxtree: Ecient, lazy sux tree implementation. http://hackage.haskell.org/package/suxtree [10] Danny. http://hkn.eecs.berkeley.edu/ dyoo/plt/suxtree/ [11] Zhang Shaojie. Lecture of Sux Trees. http://www.cs.ucf.edu/ shzhang/Combio09/lec3.pdf [12] Lloyd Allison. Sux Trees. http://www.allisons.org/ll/AlgDS/Tree/Sux/ [13] Esko Ukkonen. Sux tree and sux array techniques for pattern analysis in strings. http://www.cs.helsinki.
  • 691. /u/ukkonen/Erice2005.ppt [14] Esko Ukkonen Approximate string-matching over sux trees. Proc. CPM 93. Lecture Notes in Computer Science 684, pp. 228-242, Springer 1993. http://www.cs.helsinki.
  • 694. Chapter 7 B-Trees 7.1 Introduction B-Tree is important data structure. It is widely used in modern
  • 695. le systems. Some are implemented based on B+ tree, which is extended from B-tree. B-tree is also widely used in database systems. Some textbooks introduce B-tree with the the problem of how to access a large block of data on magnetic disks or secondary storage devices[2]. It is also helpful to understand B-tree as a generalization of balanced binary search tree[2]. Refer to the Figure 7.1, It is easy to
  • 696. nd the dierence and similarity of B-tree regarding to binary search tree. M C G P T W A B D E F H I J K N O Q R S U V X Y Z Figure 7.1: Example B-Tree Remind the de
  • 697. nition of binary search tree. A binary search tree is either an empty node; or a node contains 3 parts, a value, a left child and a right child. Both children are also binary search trees. The binary search tree satis
  • 698. es the constraint that. all the values on the left child are not greater than the value of of this node; the value of this node is not greater than any values on the right child. 165
  • 699. 166 CHAPTER 7. B-TREES For non-empty binary tree (L; k;R), where L, R and k are the left, right chil-dren, and the key. Function Key(T) accesses the key of tree T. The constraint can be represented as the following. 8x 2 L; 8y 2 R ) Key(x) k Key(y) (7.1) If we extend this de
  • 700. nition to allow multiple keys and children, we get the B-tree de
  • 701. nition. A B-tree is either empty; or contains n keys, and n + 1 children, each child is also a B-Tree, we denote these keys and children as k1; k2; :::; kn and c1; c2; :::; cn; cn+1. Figure 7.2 illustrates a B-Tree node. C[1] K[1] C[2] K[2] ... C[n] K[n] C[n+1] Figure 7.2: A B-Tree node The keys and children in a node satisfy the following order constraints. Keys are stored in non-decreasing order. that k1 k2 ::: kn; for each ki, all elements stored in child ci are not greater than ki, while ki is not greater than any values stored in child ci+1. The constraints can be represented as in equation (7.2) as well. 8xi 2 ci; i = 0; 1; :::; n;) x1 k1 x2 k2 ::: xn kn xn+1 (7.2) Finally, after adding some constraints to make the tree balanced, we get the complete B-tree de
  • 702. nition. All leaves have the same depth; We de
  • 703. ne integral number, t, as the minimum degree of B-tree; { each node can have at most 2t 1 keys; { each node can have at least t 1 keys, except the root; Consider a B-tree holds n keys. The minimum degree t 2. The height is h. All the nodes have at least t 1 keys except the root. The root contains at least 1 key. There are at least 2 nodes at depth 1, at least 2t nodes at depth 2, at least 2t2 nodes at depth 3, ...,
  • 704. nally, there are at least 2th1 nodes at depth h. Times all nodes with t 1 except for root, the total number of keys satis
  • 705. es the following inequality. n 1 + (t 1)(2 + 2t + 2t2 + ::: + 2th1) = 1 + 2(t 1) hX1 k=0 tk = 1 + 2(t 1) th 1 t 1 = 2th 1 (7.3)
  • 706. 7.2. INSERTION 167 Thus we have the inequality between the height and the number of keys. h logt n + 1 2 (7.4) This is the reason why B-tree is balanced. The simplest B-tree is so called 2-3-4 tree, where t = 2, that every node except root contains 2 or 3 or 4 keys. red-black tree can be mapped to 2-3-4 tree essentially. The following Python code shows example B-tree de
  • 707. nition. It explicitly pass t when create a node. class BTree: def __init__(self, t): self.t = t self.keys = [] self.children = [] B-tree nodes commonly have satellite data as well. We ignore satellite data for illustration purpose. In this chapter, we will
  • 708. rstly introduce how to generate B-tree by insertion. Two dierent methods will be explained. One is the classic method as in [2], that we split the node before insertion if it's full; the other is the modify-
  • 709. x approach which is quite similar to the red-black tree solution [3] [2]. We will next explain how to delete key from B-tree and how to look up a key. 7.2 Insertion B-tree can be created by inserting keys repeatedly. The basic idea is similar to the binary search tree. When insert key x, from the tree root, we examine all the keys in the node to
  • 710. nd a position where all the keys on the left are less than x, while all the keys on the right are greater than x. If the current node is a leaf node, and it is not full (there are less then 2t 1 keys in this node), x will be insert at this position. Otherwise, the position points to a child node. We need recursively insert x to it. Figure 7.3 shows one example. The B-tree illustrated is 2-3-4 tree. When insert key x = 22, because it's greater than the root, the right child contains key 26, 38, 45 is examined next; Since 22 26, the
  • 711. rst child contains key 21 and 25 are examined. This is a leaf node, and it is not full, key 22 is inserted to this node. However, if there are 2t1 keys in the leaf, the new key x can't be inserted, because this node is 'full'. When try to insert key 18 to the above example B-tree will meet this problem. There are 2 methods to solve it. 7.2.1 Splitting Split before insertion If the node is full, one method to solve the problem is to split to node before insertion. For a node with at1 keys, it can be divided into 3 parts as shown in Figure 7.4. the left part contains the
  • 712. rst t 1 keys and t children. The right part contains the rest t 1 keys and t children. Both left part and right part are
  • 713. 168 CHAPTER 7. B-TREES 2 0 4 1 1 2 6 3 8 4 5 1 2 5 8 9 1 2 1 5 1 6 1 7 2 1 2 5 3 0 3 1 3 7 4 0 4 2 4 6 4 7 5 0 (a) Insert key 22 to the 2-3-4 tree. 22 20, go to the right child; 22 26 go to the
  • 714. rst child. 2 0 4 1 1 2 6 3 8 4 5 1 2 5 8 9 1 2 1 5 1 6 1 7 2 1 2 2 2 5 3 0 3 1 3 7 4 0 4 2 4 6 4 7 5 0 (b) 21 22 25, and the leaf isn't full. Figure 7.3: Insertion is similar to binary search tree. valid B-tree nodes. the middle part is the t-th key. We can push it up to the parent node (if the current node is root, then the this key, with the two children will be the new root). For node x, denote K(x) as keys, C(x) as children. The i-th key as ki(x), the j-th child as cj(x). Below algorithm describes how to split the i-th child for a given node. 1: procedure Split-Child(node; i) 2: x ci(node) 3: y CREATE-NODE 4: Insert(K(node); i; kt(x)) 5: Insert(C(node); i + 1; y) 6: K(y) fkt+1(x); kt+2(x); :::; k2t1(x)g 7: K(x) fk1(x); k2(x); :::; kt1(x)g 8: if y is not leaf then 9: C(y) fct+1(x); ct+2(x); :::; c2t(x)g 10: C(x) fc1(x); c2(x); :::; ct(x)g The following example Python program implements this child splitting al-gorithm. def split_child(node, i): t = node.t x = node.children[i] y = BTree(t) node.keys.insert(i, x.keys[t-1]) node.children.insert(i+1, y) y.keys = x.keys[t:] x.keys = x.keys[:t-1]
  • 715. 7.2. INSERTION 169 K[1] K[2] ... K[t] ... K[2t-1] C[1] C[2] ... C[t] C[t+1] ... C[2t-1] C[2t] (a) Before split ... K[t] ... K[1] K[2] ... K[t-1] C[1] C[2] ... C[t] K[t+1] ... K[2t-1] C[t+1] ... C[2t-1] (b) After split Figure 7.4: Split node if not is_leaf(x): y.children = x.children[t:] x.children = x.children[:t] Where function is leaf test if a node is leaf. def is_leaf(t): return t.children == [] After splitting, a key is pushed up to its parent node. It is quite possible that the parent node has already been full. And this pushing violates the B-tree property. In order to solve this problem, we can check from the root along the path of insertion traversing till the leaf. If there is any node in this path is full, the splitting is applied. Since the parent of this node has been examined, it is ensured that there are less than 2t 1 keys in the parent. It won't make the parent full if pushing up one key. This approach only need one single pass down the tree without any back-tracking. If the root need splitting, a new node is created as the new root. There is no keys in this new created root, and the previous root is set as the only child. After that, splitting is performed top-down. And we can insert the new key
  • 716. nally. 1: function Insert(T; k) 2: r T 3: if r is full then . root is full 4: s CREATE-NODE 5: C(s) frg 6: Split-Child(s; 1) 7: r s 8: return Insert-Nonfull(r; k) Where algorithm Insert-Nonfull assumes the node passed in is not full.
  • 717. 170 CHAPTER 7. B-TREES If it is a leaf node, the new key is inserted to the proper position based on the order; Otherwise, the algorithm
  • 718. nds a proper child node to which the new key will be inserted. If this child is full, splitting will be performed. 1: function Insert-Nonfull(T; k) 2: if T is leaf then 3: i 1 4: while i jK(T)j ^ k ki(T) do 5: i i + 1 6: Insert(K(T); i; k) 7: else 8: i jK(T)j 9: while i 1 ^ k ki(T) do 10: i i 1 11: if ci(T) is full then 12: Split-Child(T; i) 13: if k ki(T) then 14: i i + 1 15: Insert-Nonfull(ci(T); k) 16: return T This algorithm is recursive. In B-tree, the minimum degree t is typically relative to magnetic disk structure. Even small depth can support huge amount of data (with t = 10, maximum to 10 billion data can be stored in a B-tree with height of 10). The recursion can also be eliminated. This is left as exercise to the reader. Figure 7.5 shows the result of continuously inserting keys G, M, P, X, A, C, D, E, J, K, N, O, R, S, T, U, V, Y, Z to the empty tree. The
  • 719. rst result is the 2-3-4 tree (t = 2). The second result shows how it varies when t = 3. E P C M S U X A D G J K N O R T V Y Z (a) 2-3-4 tree. D M P T A C E G J K N O R S U V X Y Z (b) t = 3 Figure 7.5: Insertion result Below example Python program implements this algorithm. def insert(tr, key):
  • 720. 7.2. INSERTION 171 root = tr if is_full(root): s = BTree(root.t) s.children.insert(0, root) split_child(s, 0) root = s return insert_nonfull(root, key) And the insertion to non-full node is implemented as the following. def insert_nonfull(tr, key): if is_leaf(tr): ordered_insert(tr.keys, key) else: i = len(tr.keys) while i0 and key tr.keys[i-1]: i = i-1 if is_full(tr.children[i]): split_child(tr, i) if keytr.keys[i]: i = i+1 insert_nonfull(tr.children[i], key) return tr Where function ordered insert is used to insert an element to an ordered list. Function is full tests if a node contains 2t 1 keys. def ordered_insert(lst, x): i = len(lst) lst.append(x) while i0 and lst[i]lst[i-1]: (lst[i-1], lst[i]) = (lst[i], lst[i-1]) i=i-1 def is_full(node): return len(node.keys) 2 node.t - 1 For the array based collection, append on the tail is much more eective than insert in other position, because the later takes O(n) time, if the length of the collection is n. The ordered insert program
  • 721. rstly appends the new element at the end of the existing collection, then iterates from the last element to the
  • 722. rst one, and checks if the current two elements next to each other are ordered. If not, these two elements will be swapped. Insert then
  • 723. xing In functional settings, B-tree insertion can be realized in a way similar to red-black tree. When insert a key to red-black tree, it is
  • 724. rstly inserted as in the normal binary search tree, then recursive
  • 725. xing is performed to resume the balance of the tree. B-tree can be viewed as extension to the binary search tree, that each node contains multiple keys and children. We can
  • 726. rstly insert the key without considering if the node is full. Then perform
  • 727. xing to satisfy the minimum degree constraint. insert(T; k) = fix(ins(T; k)) (7.5)
  • 728. 172 CHAPTER 7. B-TREES Function ins(T; k) traverse the B-tree T from root to
  • 729. nd a proper position where key k can be inserted. After that, function fix is applied to resume the B-tree properties. Denote B-tree in a form of T = (K;C; t), where K represents keys, C represents children, and t is the minimum degree. Below is the Haskell de
  • 730. nition of B-tree. data BTree a = Node{ keys :: [a] , children :: [BTree a] , degree :: Int} deriving (Eq) The insertion function can be provided based on this de
  • 731. nition. insert tr x = fixRoot $ ins tr x There are two cases when realize ins(T; k) function. If the tree T is leaf, k is inserted to the keys; Otherwise if T is the branch node, we need recursively insert k to the proper child. Figure 7.6 shows the branch case. The algorithm
  • 732. rst locates the position. for certain key ki, if the new key k to be inserted satisfy ki1 k ki, Then we need recursively insert k to child ci. This position divides the node into 3 parts, the left part, the child ci and the right part. k, K[i-1]kK[i] insert to K[1] K[2] ... K[i-1] K[i] ... K[n] C[1] C[2] ... C[i-1] C[i] C[i+1] ... C[n] C[n+1] (a) Locate the child to insert. K[1] K[2] ... K[i-1] C[1] C[2] ... C[i-1] k, K[i-1]kK[i] recursive insert C[i] K[i] K[i+1] ... K[n] C[i+1] ... C[n+1] (b) Recursive insert. Figure 7.6: Insert a key to a branch node ins(T; k) = (K0 [ fkg [ K00;; t) : C = ; (K0;K00) = divide(K; k) make((K0;C1); ins(c; k); (K00;C0 2)) : (C1;C2) = split(jK0j;C) (7.6) The
  • 733. rst clause deals with the leaf case. Function divide(K; k) divide keys into two parts, all keys in the
  • 734. rst part are not greater than k, and all rest keys are not less than k.
  • 735. 7.2. INSERTION 173 K = K 0 [ K 00 ^ 8k 0 2 K; k 00 2 K 00 ) k 0 k k 00 The second clause handle the branch case. Function split(n;C) splits chil-dren in two parts, C1 and C2. C1 contains the
  • 736. rst n children; and C2 contains the rest. Among C2, the
  • 737. rst child is denoted as c, and others are represented as C0 2. Here the key k need be recursively inserted into child c. Function make takes 3 parameter. The
  • 738. rst and the third are pairs of key and children; the second parameter is a child node. It examine if a B-tree node made from these keys and children violates the minimum degree constraint and performs
  • 739. xing if necessary. make((K 0 ;C 0 ); c; (K 00 ;C 00 )) = fixF ull((K0;C0); c; (K00;C00)) : full(c) (K0 [ K00;C0 [ fcg [ C00; t) : otherwise (7.7) Where function full(c) tests if the child c is full. Function fixF ull splits the the child c, and forms a new B-tree node with the pushed up key. fixF ull((K 0 ;C 0 ); c; (K 00 ;C 00 )) = (K 0 [ fkg [ K 00 ;C 0 [ fc1; c2g [ C 00 ; t) (7.8) Where (c1; k; c2) = split(c). During splitting, the
  • 740. rst t 1 keys and t children are extract to one new child, the last t 1 keys and t children form another child. The t-th key is pushed up. With all the above functions de
  • 741. ned, we can realize fix(T) to complete the functional B-tree insertion algorithm. It
  • 742. rstly checks if the root contains too many keys. If it exceeds the limit, splitting will be applied. The split result will be used to make a new node, so the total height of the tree increases by one. fix(T) = 8 : c : T = (; fcg; t) (fkg; fc1; c2g; t) : full(T); (c1; k; c2) = split(T) T : otherwise (7.9) The following Haskell example code implements the B-tree insertion. import qualified Data.List as L ins (Node ks [] t) x = Node (L.insert x ks) [] t ins (Node ks cs t) x = make (ks', cs') (ins c x) (ks'', cs'') where (ks', ks'') = L.partition (x) ks (cs', (c:cs'')) = L.splitAt (length ks') cs fixRoot (Node [] [tr] _) = tr -- shrink height fixRoot tr = if full tr then Node [k] [c1, c2] (degree tr) else tr where (c1, k, c2) = split tr make (ks', cs') c (ks'', cs'') j full c = fixFull (ks', cs') c (ks'', cs'')
  • 743. 174 CHAPTER 7. B-TREES j otherwise = Node (ks'++ks'') (cs'++[c]++cs'') (degree c) fixFull (ks', cs') c (ks'', cs'') = Node (ks'++[k]++ks'') (cs'++[c1,c2]++cs'') (degree c) where (c1, k, c2) = split c full tr = (length $ keys tr) 2(degree tr)-1 Figure 7.7 shows the varies of results of building B-trees by continuously inserting keys GMPXACDEJKNORSTUVYZ. E O C M R T V A D G J K N P S U X Y Z (a) Insert result of a 2-3-4 tree. G M P T A C D E J K N O R S U V X Y Z (b) Insert result of a B-tree with t = 3 Figure 7.7: Insert then
  • 744. xing results Compare to the imperative insertion result as shown in
  • 745. gure 7.7 we can found that there are dierent. However, they are all valid because all B-tree properties are satis
  • 746. ed. 7.3 Deletion Deleting a key from B-tree may violate balance properties. Except the root, a node shouldn't contain too few keys less than t 1, where t is the minimum degree. Similar to the approaches for insertion, we can either do some preparation so that the node from where the key being deleted contains enough keys; or do some
  • 747. xing after the deletion if the node has too few keys. 7.3.1 Merge before delete method We start from the easiest case. If the key k to be deleted can be located in node x, and x is a leaf node, we can directly remove k from x. If x is the root (the only node of the tree), we needn't worry about there are too few keys after deletion. This case is named as case 1 later.
  • 748. 7.3. DELETION 175 In most cases, we start from the root, along a path to locate where is the node contains k. If k can be located in the internal node x, there are three sub cases. Case 2a, If the child y precedes k contains enough keys (more than t). We replace k in node x with k0, which is the predecessor of k in child y. And recursively remove k0 from y. The predecessor of k can be easily located as the last key of child y. This is shown in
  • 749. gure 7.8. Figure 7.8: Replace and delete from predecessor. Case 2b, If y doesn't contain enough keys, while the child z follows k contains more than t keys. We replace k in node x with k00, which is the successor of k in child z. And recursively remove k00 from z. The successor of k can be easily located as the
  • 750. rst key of child z. This sub-case is illustrated in
  • 751. gure 7.9. Case 2c, Otherwise, if neither y, nor z contains enough keys, we can merge y, k and z into one new node, so that this new node contains 2t 1 keys. After that, we can then recursively do the removing. Note that after merge, if the current node doesn't contain any keys, which means k is the only key in x. y and z are the only two children of x. we need shrink the tree height by one. Figure 7.10 illustrates this sub-case. the last case states that, if k can't be located in node x, the algorithm need
  • 752. nd a child node ci in x, so that the sub-tree ci contains k. Before the deletion is recursively applied in ci, we need make sure that there are at least t keys in ci. If there are not enough keys, the following adjustment is performed.
  • 753. 176 CHAPTER 7. B-TREES Figure 7.9: Replace and delete from successor. Figure 7.10: Merge and delete.
  • 754. 7.3. DELETION 177 Case 3a, We check the two sibling of ci, which are ci1 and ci+1. If either one contains enough keys (at least t keys), we move one key from x down to ci, and move one key from the sibling up to x. Also we need move the relative child from the sibling to ci. This operation makes ci contains enough keys for deletion. we can next try to delete k from ci recursively. Figure 7.11 illustrates this case. Figure 7.11: Borrow from the right sibling. Case 3b, In case neither one of the two siblings contains enough keys, we then merge ci, a key from x, and either one of the sibling into a new node. Then do the deletion on this new node. Figure 7.12 shows this case. Before de
  • 755. ne the B-tree delete algorithm, we need provide some auxiliary functions. Function Can-Del tests if a node contains enough keys for deletion. 1: function Can-Del(T) 2: return jK(T)j t Procedure Merge-Children(T; i) merges child ci(T), key ki(T), and child ci+1(T) into one big node. 1: procedure Merge-Children(T; i) . Merge ci(T), ki(T), and ci+1(T) 2: x ci(T) 3: y ci+1(T) 4: K(x) K(x) [ fki(T)g [ K(y) 5: C(x) C(x) [ C(y) 6: Remove-At(K(T); i)
  • 756. 178 CHAPTER 7. B-TREES Figure 7.12: Merge ci, k, and ci+1 to a new node. 7: Remove-At(C(T); i + 1) Procedure Merge-Children merges the i-th child, the i-th key, and i + 1- th child of node T into a new child, and remove the i-th key and i + 1-th child from T after merging. With these functions de
  • 757. ned, the B-tree deletion algorithm can be given by realizing the above 3 cases. 1: function Delete(T; k) 2: i 1 3: while i jK(T)j do 4: if k = ki(T) then 5: if T is leaf then . case 1 6: Remove(K(T); k) 7: else . case 2 8: if Can-Del(ci(T)) then . case 2a 9: ki(T) Last-Key(ci(T)) 10: Delete(ci(T); ki(T)) 11: else if Can-Del(ci+1(T)) then . case 2b 12: ki(T) First-Key(ci+1(T)) 13: Delete(ci+1(T); ki(T)) 14: else . case 2c 15: Merge-Children(T; i) 16: Delete(ci(T); k) 17: if K(T) = NIL then 18: T ci(T) . Shrinks height
  • 758. 7.3. DELETION 179 19: return T 20: else if k ki(T) then 21: Break 22: else 23: i i + 1 24: if T is leaf then 25: return T . k doesn't exist in T. 26: if : Can-Del(ci(T)) then . case 3 27: if i 1^ Can-Del(ci1(T)) then . case 3a: left sibling 28: Insert(K(ci(T)); ki1(T)) 29: ki1(T) Pop-Back(K(ci1(T))) 30: if ci(T) isn't leaf then 31: c Pop-Back(C(ci1(T))) 32: Insert(C(ci(T)); c) 33: else if i jC(T)j^ Can-Del(ci1 (T)) then . case 3a: right sibling 34: Append(K(ci(T)); ki(T)) 35: ki(T) Pop-Front(K(ci+1(T))) 36: if ci(T) isn't leaf then 37: c Pop-Front(C(ci+1(T))) 38: Append(C(ci(T)); c) 39: else . case 3b 40: if i 1 then 41: Merge-Children(T; i 1) 42: else 43: Merge-Children(T; i) 44: Delete(ci(T); k) . recursive delete 45: if K(T) = NIL then . Shrinks height 46: T c1(T) 47: return T Figure 7.13, 7.14, and 7.15 show the deleting process step by step. The nodes modi
  • 759. ed are shaded. The following example Python program implements the B-tree deletion al-gorithm. def can_remove(tr): return len(tr.keys) tr.t def replace_key(tr, i, k): tr.keys[i] = k return k def merge_children(tr, i): tr.children[i].keys += [tr.keys[i]] + tr.children[i+1].keys tr.children[i].children += tr.children[i+1].children tr.keys.pop(i) tr.children.pop(i+1) def B_tree_delete(tr, key): i = len(tr.keys)
  • 760. 180 CHAPTER 7. B-TREES P C G M T X A B D E F J K L N O Q R S U V Y Z (a) A B-tree before deleting. P C G M T X A B D E J K L N O Q R S U V Y Z (b) After delete key 'F', case 1. Figure 7.13: Result of B-tree deleting (1). P C G L T X A B D E J K N O Q R S U V Y Z (a) After delete key 'M', case 2a. P C L T X A B D E J K N O Q R S U V Y Z (b) After delete key 'G', case 2c. Figure 7.14: Result of B-tree deleting program (2)
  • 761. 7.3. DELETION 181 C L P T X A B E J K N O Q R S U V Y Z (a) After delete key 'D', case 3b, and height is shrunk. E L P T X A C J K N O Q R S U V Y Z (b) After delete key 'B', case 3a, borrow from right sibling. E L P S X A C J K N O Q R T V Y Z (c) After delete key 'U', case 3a, borrow from left sibling. Figure 7.15: Result of B-tree deleting program (3) while i0: if key == tr.keys[i-1]: if tr.leaf: # case 1 in CLRS tr.keys.remove(key) else: # case 2 in CLRS if tr.children[i-1].can_remove(): # case 2a key = tr.replace_key(i-1, tr.children[i-1].keys[-1]) B_tree_delete(tr.children[i-1], key) elif tr.children[i].can_remove(): # case 2b key = tr.replace_key(i-1, tr.children[i].keys[0]) B_tree_delete(tr.children[i], key) else: # case 2c tr.merge_children(i-1) B_tree_delete(tr.children[i-1], key) if tr.keys==[]: # tree shrinks in height tr = tr.children[i-1] return tr elif key tr.keys[i-1]: break else: i = i-1 # case 3 if tr.leaf: return tr #key doesn't exist at all if not tr.children[i].can_remove(): if i0 and tr.children[i-1].can_remove(): #left sibling tr.children[i].keys.insert(0, tr.keys[i-1]) tr.keys[i-1] = tr.children[i-1].keys.pop()
  • 762. 182 CHAPTER 7. B-TREES if not tr.children[i].leaf: tr.children[i].children.insert(0, tr.children[i-1].children.pop()) elif ilen(tr.children) and tr.children[i+1].can_remove(): #right sibling tr.children[i].keys.append(tr.keys[i]) tr.keys[i]=tr.children[i+1].keys.pop(0) if not tr.children[i].leaf: tr.children[i].children.append(tr.children[i+1].children.pop(0)) else: # case 3b if i0: tr.merge_children(i-1) else: tr.merge_children(i) B_tree_delete(tr.children[i], key) if tr.keys==[]: # tree shrinks in height tr = tr.children[0] return tr 7.3.2 Delete and
  • 763. x method The merge and delete algorithm is a bit complex. There are several cases, and in each case, there are sub cases to deal. Another approach to design the deleting algorithm is to perform
  • 764. xing after deletion. It is similar to the insert-then-
  • 765. x strategy. delete(T; k) = fix(del(T; k)) (7.10) When delete a key from B-tree, we
  • 766. rstly locate which node this key is contained. We traverse from the root to the leaves till
  • 767. nd this key in some node. If this node is a leaf, we can remove the key, and then examine if the deletion makes the node contains too few keys to satisfy the B-tree balance properties. If it is a branch node, removing the key breaks the node into two parts. We need merge them together. The merging is a recursive process which is shown in
  • 768. gure 7.16. When do merging, if the two nodes are not leaves, we merge the keys to-gether, and recursively merge the last child of the left part and the
  • 769. rst child of the right part to one new node. Otherwise, if they are leaves, we merely put all keys together. Till now, the deleting in performed in straightforward way. However, delet-ing decreases the number of keys of a node, and it may result in violating the B-tree balance properties. The solution is to perform
  • 770. xing along the path traversed from root. During the recursive deletion, the branch node is broken into 3 parts. The left part contains all keys less than k, includes k1; k2; :::; ki1, and children c1; c2; :::; ci1, the right part contains all keys greater than k, say ki; ki+1; :::; kn+1, and children ci+1; ci+2; :::; cn+1. Then key k is recursively deleted from child ci. Denote the result becomes c0 i after that. We need make a new node from these 3 parts, as shown in
  • 771. gure 7.17. At this time point, we need examine if c0 i contains enough keys. If there are to less keys (less than t 1, but not t in contrast to the merge-and-delete approach), we can either borrow a key-child pair from the left or the right part,
  • 772. 7.3. DELETION 183 Figure 7.16: Delete a key from a branch node. Removing ki breaks the node into 2 parts. Merging these 2 parts is a recursive process. When the two parts are leaves, the merging terminates. Figure 7.17: After delete key k from node ci, denote the result as c0 i. The
  • 773. xing makes a new node from the left part, c0 i and the right part.
  • 774. 184 CHAPTER 7. B-TREES and do inverse operation of splitting. Figure 7.18 shows example of borrowing from the left part. Figure 7.18: Borrow a key-child pair from left part and un-split to a new child. If both left part and right part are empty, we can simply push c0 i up. Denote the B-tree as T = (K;C; t), where K and C are keys and children. The del(T; k) function deletes key k from the tree. del(T; k) = 8 : (delete(K; k);; t) : C = merge((K1;C1; t); (K2;C2; t)) : ki = k make((K0 1;C0 1); del(c; k); (K0 2;C0 2)) : k =2 K (7.11) If children C = is empty, T is leaf. k is deleted from keys directly. Otherwise, T is internal node. If k 2 K, removing it separates the keys and children in two parts (K1;C1) and (K2;C2). They will be recursively merged. K1 = fk1; k2; :::; ki1g K2 = fki+1; ki+2; :::; kmg C1 = fc1; c2; :::; cig C2 = fci+1; ci+2; :::; cm+1g If k =2 K, we need locate a child c, and further delete k from it. (K0 1;K0 2) = (fk0jk0 2 K; k0 kg; fk0jk0 2 K; k k0g) (C0 1; fcg [ C0 2) = splitAt(jK0 1 j;C) The recursive merge function is de
  • 775. ned as the following. When merge two trees T1 = (K1;C1; t) and T2 = (K2;C2; t), if both are leaves, we create a new leave by concatenating the keys. Otherwise, the last child in C1, and the
  • 776. rst child in C2 are recursively merged. And we call make function to form the new tree. When C1 and C2 are not empty, denote the last child of C1 as c1;m, the
  • 777. 7.3. DELETION 185 rest as C0 1; the
  • 778. rst child of C2 as C2;1, the rest as C0 2. Below equation de
  • 779. nes the merge function. merge(T1; T2) = (K1 [ K2;; t) : C1 = C2 = make((K1;C0 1); merge(c1;m; c2;1); (K2;C0 2)) : otherwise (7.12) The make function de
  • 780. ned above only handles the case that a node contains too many keys due to insertion. When delete key, it may cause a node contains too few keys. We need test and
  • 781. x this situation as well. make((K 0 ;C 0 ); c; (K 00 ;C 00 )) = 8 : fixF ull((K0;C0); c; (K00;C00)) : full(c) fixLow((K0;C0); c; (K00;C00)) : low(c) (K0 [ K00;C0 [ fcg [ C00; t) : otherwise (7.13) Where low(T) checks if there are too few keys less than t 1. Function fixLow(Pl; c; Pr) takes three arguments, the left pair of keys and children, a child node, and the right pair of keys and children. If the left part isn't empty, we borrow a pair of key-child, and do un-splitting to make the child contain enough keys, then recursively call make; If the right part isn't empty, we borrow a pair from the right; and if both sides are empty, we return the child node as result. In this case, the height of the tree shrinks. Denote the left part Pl = (Kl;Cl). If Kl isn't empty, the last key and child are represented as kl;m and cl;m respectively. The rest keys and children become K0 l and C0 l ; Similarly, the right part is denoted as Pr = (Kr;Cr). If Kr isn't empty, the
  • 782. rst key and child are represented as kr;1, and cr;1. The rest keys and children are K0 r and C0 r. Below equation gives the de
  • 783. nition of fixLow. fixLow(Pl; c; Pr) = 8 : make((K0 l ;C0 l ); unsplit(cl;m; kl;m; c); (Kr;Cr)) : Kl6= make((Kr;Cr); unsplit(c; kr;1; cr;1); (K0 r;C0 r)) : Kr6= c : otherwise (7.14) Function unsplit(T1; k; T2) is the inverse operation to splitting. It forms a new B-tree nodes from two small nodes and a key. unsplit(T1; k; T2) = (K1 [ fkg [ K2;C1 [ C2; t) (7.15) The following example Haskell program implements the B-tree deletion al-gorithm. import qualified Data.List as L delete tr x = fixRoot $ del tr x del:: (Ord a) ) BTree a ! a ! BTree a del (Node ks [] t) x = Node (L.delete x ks) [] t del (Node ks cs t) x = case L.elemIndex x ks of Just i ! merge (Node (take i ks) (take (i+1) cs) t) (Node (drop (i+1) ks) (drop (i+1) cs) t) Nothing ! make (ks', cs') (del c x) (ks'', cs'')
  • 784. 186 CHAPTER 7. B-TREES where (ks', ks'') = L.partition (x) ks (cs', (c:cs'')) = L.splitAt (length ks') cs merge (Node ks [] t) (Node ks' [] _) = Node (ks++ks') [] t merge (Node ks cs t) (Node ks' cs' _) = make (ks, init cs) (merge (last cs) (head cs')) (ks', tail cs') make (ks', cs') c (ks'', cs'') j full c = fixFull (ks', cs') c (ks'', cs'') j low c = fixLow (ks', cs') c (ks'', cs'') j otherwise = Node (ks'++ks'') (cs'++[c]++cs'') (degree c) low tr = (length $ keys tr) (degree tr)-1 fixLow (ks'@(_:_), cs') c (ks'', cs'') = make (init ks', init cs') (unsplit (last cs') (last ks') c) (ks'', cs'') fixLow (ks', cs') c (ks''@(_:_), cs'') = make (ks', cs') (unsplit c (head ks'') (head cs'')) (tail ks'', tail cs'') fixLow _ c _ = c unsplit c1 k c2 = Node ((keys c1)++[k]++(keys c2)) ((children c1)++(children c2)) (degree c1) When delete the same keys from the B-tree as in merge and
  • 785. xing approach, the results are dierent. However, both satisfy the B-tree properties, so they are all valid. M C G P T W A B D E F H I J K N O Q R S U V X Y Z (a) B-tree before deleting M C G P T W A B D F H I J K N O Q R S U V X Y Z (b) After delete key 'E'. Figure 7.19: Result of delete-then-
  • 787. 7.4. SEARCHING 187 M C H P T W A B D F I J K N O Q R S U V X Y Z (a) After delete key 'G', H M P T W B C D F I J K N O Q R S U V X Y Z (b) After delete key 'A'. Figure 7.20: Result of delete-then-
  • 788. xing (2) H P T W B C D F I J K N O Q R S U V X Y Z (a) After delete key 'M'. H P W B C D F I J K N O Q R S T V X Y Z (b) After delete key 'U'. Figure 7.21: Result of delete-then-
  • 789. xing (3) 7.4 Searching Searching in B-tree can be considered as the generalized tree search extended from binary search tree. When searching in the binary tree, there are only 2 dierent directions, the left and the right. However, there are multiple directions in B-tree. 1: function Search(T; k) 2: loop 3: i 1 4: while i jK(T)j ^ k ki(T) do 5: i i + 1 6: if i jK(T)j ^ k = ki(T) then
  • 790. 188 CHAPTER 7. B-TREES 7: return (T; i) 8: if T is leaf then 9: return NIL . k doesn't exist 10: else 11: T ci(T) Starts from the root, this program examines each key one by one from the smallest to the biggest. In case it
  • 791. nds the matched key, it returns the current node and the index of this key. Otherwise, if it
  • 792. nds the position i that ki k ki+1, the program will next search the child node ci+1 for the key. If it traverses to some leaf node, and fails to
  • 793. nd the key, the empty value is returned to indicate that this key doesn't exist in the tree. The following example Python program implements the search algorithm. def B_tree_search(tr, key): while True: for i in range(len(tr.keys)): if key tr.keys[i]: break if key == tr.keys[i]: return (tr, i) if tr.leaf: return None else: if key tr.keys[-1]: i=i+1 tr = tr.children[i] The search algorithm can also be realized by recursion. When search key k in B-tree T = (K;C; t), we partition the keys with k. K1 = fk0jk0 kg K2 = fk0jk k0g Thus K1 contains all the keys less than k, and K2 holds the rest. If the
  • 794. rst element in K2 is equal to k, we
  • 795. nd the key. Otherwise, we recursively search the key in child cjK1j+1. search(T; k) = 8 : (T; jK1j + 1) : k 2 K2 : C = search(cjK1j+1; k) : otherwise (7.16) Below example Haskell program implements this algorithm. search :: (Ord a)) BTree a ! a ! Maybe (BTree a, Int) search tr@(Node ks cs _) k j matchFirst k $ drop len ks = Just (tr, len) j otherwise = if null cs then Nothing else search (cs !! len) k where matchFirst x (y:_) = x==y matchFirst x _ = False len = length $ filter (k) ks
  • 796. 7.5. NOTES AND SHORT SUMMARY 189 7.5 Notes and short summary In this chapter, we explained the B-tree data structure as a kind of extension from binary search tree. The background knowledge of magnetic disk access is skipped, user can refer to [2] for detail. For the three main operations, insertion, deletion, and searching, both imperative and functional algorithms are given. They traverse from the root to the leaf. All the three operations perform in time proportion to the height of the tree. Because B-tree always maintains the balance properties. The performance is ensured to bound to O(lg n) time, where n is the number of the keys in B-tree. Exercise 7.1 Eliminate the recursion in imperative B-tree insertion algorithm.
  • 797. 190 CHAPTER 7. B-TREES
  • 798. Bibliography [1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937. [2] B-tree, Wikipedia. http://en.wikipedia.org/wiki/B-tree [3] Chris Okasaki. FUNCTIONAL PEARLS Red-Black Trees in a Functional Setting. J. Functional Programming. 1998 191
  • 802. Chapter 8 Binary Heaps 8.1 Introduction Heap is one of the elementary data structure. It is widely used to solve some practical problems, such as sorting, prioritized scheduling, and graph algorithms[2]. Most popular implementations of heap are using a kind of implicit binary heap by array, which is described in [2]. Examples include C++/STL heap and Python heapq. The most ecient heap sort algorithm is also realized with binary heap as proposed by R. W. Floyd [3] [5]. However, heaps can be general and realized with varies of other data struc-tures besides array. In this chapter, explicit binary tree is used. It leads to Leftist heaps, Skew heaps, and Splay heaps, which are suitable for purely func-tional implementation as shown by Okasaki[6]. A heap is a data structure that satis
  • 803. es the following heap property. Top operation always returns the minimum (maximum) element; Pop operation removes the top element from the heap while the heap property should be kept, so that the new top element is still the minimum (maximum) one; Insert a new element to heap should keep the heap property. That the new top is still the minimum (maximum) element; Other operations including merge etc should all keep the heap property. This is a kind of recursive de
  • 804. nition, while it doesn't limit the under ground data structure. We call the heap with the minimum element on top as min-heap, while if the top keeps the maximum element, we call it max-heap. 8.2 Implicit binary heap by array Considering the heap de
  • 805. nition in previous section, one option to implement heap is by using trees. A straightforward solution is to store the minimum (maximum) element in the root of the tree, so for `top' operation, we simply 195
  • 806. 196 CHAPTER 8. BINARY HEAPS return the root as the result. And for `pop' operation, we can remove the root and rebuild the tree from the children. If binary tree is used to implement heap the heap, we can call it binary heap. This chapter explains three dierent realizations for binary heap. 8.2.1 De
  • 808. rst one is implicit binary tree. Consider the problem how to represent a complete binary tree with array. (For example, try to represent a complete binary tree in the programming language doesn't support structure or record data type. Only array can be used). One solution is to pack all elements from top level (root) down to bottom level (leaves). Figure 8.1 shows a complete binary tree and its corresponding array repre-sentation. 1 6 1 4 1 0 8 7 2 4 1 9 3 1 6 1 4 1 0 8 7 9 3 2 4 1 Figure 8.1: Mapping between a complete binary tree and array This mapping between tree and array can be de
  • 809. ned as the following equa-tions (The array index starts from 1). 1: function PARENT(i) 2: return b i 2 c 3: function LEFT(i) 4: return 2i 5: function LEFT(i) 6: return 2i + 1 For a given tree node which is represented as the i-th element of the array, since the tree is complete, we can easily
  • 810. nd its parent node as the bi=2c-th
  • 811. 8.2. IMPLICIT BINARY HEAP BY ARRAY 197 element; Its left child with index of 2i and right child of 2i + 1. If the index of the child exceeds the length of the array, it means this node does not have such a child (leaf for example). In real implementation, this mapping can be calculated fast with bit-wise operation like the following example ANSI C code. Note that, the array index starts from zero in C like languages. #define PARENT(i) ((((i) + 1) 1) - 1) #define LEFT(i) (((i) 1) + 1) #define RIGHT(i) (((i) + 1) 1) 8.2.2 Heapify The most important thing for heap algorithm is to maintain the heap property, that the top element should be the minimum (maximum) one. For the implicit binary heap by array, it means for a given node, which is represented as the i-th index, we need develop a method to check if both its two children conform to this property. In case there is violation, we need swap the parent and child recursively [2] to
  • 812. x the problem. Below algorithm shows the iterative solution to enforce the min-heap prop-erty from the given index of the array. 1: function Heapify(A; i) 2: n jAj 3: loop 4: l Left(i) 5: r Right(i) 6: smallest i 7: if l n ^ A[l] A[i] then 8: smallest l 9: if r n ^ A[r] A[smallest] then 10: smallest r 11: if smallest6= i then 12: Exchange A[i] $ A[smallest] 13: i smallest 14: else 15: return For array A and the given index i, None its children should be bigger than A[i], in case there is violation, we pick the smallest element as A[i], and swap the previous A[i] to child. The algorithm traverses the tree top-down to
  • 813. x the heap property until either reach a leaf or there is no heap property violation. The Heapify algorithm takes O(lg n) time, where n is the number of ele-ments. This is because the loop time is proportion to the height of the binary tree. When implement this algorithm, the comparison method can be passed as a parameter, so that both min-heap and max-heap can be supported. The following ANSI C example code uses this approach. typedef int (Less)(Key, Key); int less(Key x, Key y) { return x y; }
  • 814. 198 CHAPTER 8. BINARY HEAPS int notless(Key x, Key y) { return !less(x, y); } void heapify(Key a, int i, int n, Less lt) { int l, r, m; while (1) { l = LEFT(i); r = RIGHT(i); m = i; if (l n lt(a[l], a[i])) m = l; if (r n lt(a[r], a[m])) m = r; if (m != i) { swap(a, i, m); i = m; } else break; } } Figure 8.2 illustrates the steps when Heapify processing the array f16; 4; 10; 14; 7; 9; 3; 2; 8; 1g. The array changes to f16; 14; 10; 8; 7; 9; 3; 2; 4; 1g. 8.2.3 Build a heap With Heapify algorithm de
  • 815. ned, it is easy to build a heap from the arbitrary array. Observe that the numbers of nodes in a complete binary tree for each level is a list like below: 1; 2; 4; 8; :::; 2i; :::. The only exception is the last level. Since the tree may not full (note that complete binary tree doesn't mean full binary tree), the last level contains at most 2p1 nodes, where 2p n and n is the length of the array. The Heapify algorithm doesn't have any eect on leave node. We can skip applying Heapify for all leaves. In other words, all leaf nodes have already satis
  • 816. ed the heap property. We only need start checking and maintain the heap property from the last branch node. the index of the last branch node is no greater than bn=2c. Based on this fact, we can build a heap with the following algorithm. (As-sume the heap is min-heap). 1: function Build-Heap(A) 2: n jAj 3: for i bn=2c down to 1 do 4: Heapify(A; i) Although the complexity of Heapify is O(lg n), the running time of Build- Heap is not bound to O(n lg n) but O(n). This is a linear time algorithm. [2] provides the detailed proof. Below ANSI C example program implements this heap building function. void build_heap(Key a, int n, Less lt) { int i; for (i = (n-1) 1; i 0; --i)
  • 817. 8.2. IMPLICIT BINARY HEAP BY ARRAY 199 1 6 4 1 0 1 4 7 2 8 1 9 3 a. Step 1, 14 is the biggest element among 4, 14, and 7. Swap 4 with the left child; 1 6 1 4 1 0 4 7 2 8 1 9 3 b. Step 2, 8 is the biggest element among 2, 4, and 8. Swap 4 with the right child; 1 6 1 4 1 0 8 7 2 4 1 9 3 c. 4 is the leaf node. It hasn't any children. Process terminates. Figure 8.2: Heapify example, a max-heap case.
  • 818. 200 CHAPTER 8. BINARY HEAPS heapify(a, i, n, lt); } Figure 8.3 and 8.4 show the steps when building a heap from array f4; 1; 3; 2; 16; 9; 10; 14; 8; 7g. The node in black color is the one Heapify being applied, the nodes in gray color are swapped during to keep the heap property. 4 1 3 2 1 6 9 1 0 1 4 8 7 a. An array in arbitrary order before heap building process; 4 1 3 2 1 6 1 4 8 7 9 1 0 b. Step 1, The array is mapped to binary tree. The
  • 819. rst branch node, which is 16 is examined; 4 1 3 2 1 6 1 4 8 7 9 1 0 c. Step 2, 16 is the largest element in current sub tree, next is to check node with value 2; Figure 8.3: Build a heap from the arbitrary array. Gray nodes are changed in each step, black node will be processed next step. 8.2.4 Basic heap operations From the generic de
  • 820. nition of heap (not necessarily the binary heap), It's es-sential to provides basic operations so that user can access the data and modify it. The most important operations include accessing the top element (
  • 821. nd the minimum or maximum one), popping the top element from the heap,
  • 822. nding the top n elements, decreasing a key (this is for min-heap, and it is increasing a key for max-heap), and insertion. For the binary tree, most of operations are bound to O(lg n) in worst-case, some of them, such as top is O(1) constant time.
  • 823. 8.2. IMPLICIT BINARY HEAP BY ARRAY 201 4 1 3 1 4 1 6 2 8 7 9 1 0 d. Step 3, 14 is the largest value in the sub-tree, swap 14 and 2; next is to check node with value 3; 4 1 1 0 1 4 1 6 2 8 7 9 3 e. Step 4, 10 is the largest value in the sub-tree, swap 10 and 3; next is to check node with value 1; 4 1 6 1 0 1 4 7 2 8 1 9 3 f. Step 5, 16 is the largest value in current node, swap 16 and 1
  • 824. rst; then similarly, swap 1 and 7; next is to check the root node with value 4; 1 6 1 4 1 0 8 7 2 4 1 9 3 g. Step 6, Swap 4 and 16, then swap 4 and 14, and then swap 4 and 8; And the whole build process
  • 825. nish. Figure 8.4: Build a heap from the arbitrary array. Gray nodes are changed in each step, black node will be processed next step.
  • 826. 202 CHAPTER 8. BINARY HEAPS Access the top element (the minimum) We need provide a way to access the top element eciently. For the binary tree realization, it is the root stores the minimum (maximum) value. This is the
  • 827. rst element in the array. 1: function TOP(A) 2: return A[1] This operation is trivial. It takes O(1) time. Here we skip the error handling for empty case. If the heap is empty, one option is to raise an error. Heap Pop Pop operation is more complex than accessing the top, because the heap prop-erty has to be maintained after the top element is removed. The solution is to apply Heapify algorithm to the next element after the root is removed. One simple but slow method based on this idea looks like the following. 1: function Pop-Slow(A) 2: x Top(A) 3: Remove(A, 1) 4: if A is not empty then 5: Heapify(A, 1) 6: return x This algorithm
  • 828. rstly records the top element in x, then it removes the
  • 829. rst element from the array, the size of this array is reduced by one. After that if the array isn't empty, Heapify will applied to the new array from the
  • 830. rst element (It was previously the second one). Removing the
  • 831. rst element from array takes O(n) time, where n is the length of the array. This is because we need shift all the rest elements one by one. This bottle neck slows the whole algorithm to linear time. In order to solve this problem, one alternative is to swap the
  • 832. rst element with the last one in the array, then shrink the array size by one. 1: function Pop(A) 2: x Top(A) 3: n Heap-Size(A) 4: Exchange A[1] $ A[n] 5: Remove(A; n) 6: if A is not empty then 7: Heapify(A, 1) 8: return x Removing the last element from the array takes only constant O(1) time, and Heapify is bound to O(lg n). Thus the whole algorithm performs in O(lg n) time. The following example ANSI C program implements this algorithm1. Key pop(Key a, int n, Less lt) { swap(a, 0, --n); heapify(a, 0, n, lt); 1This program does not actually remove the last element, it reuse the last cell to store the popped result
  • 833. 8.2. IMPLICIT BINARY HEAP BY ARRAY 203 return a[n]; } Since the top element is removed from the array, instead swapping, this program overwrites it with the last element before applying Heapify. Find the top k elements With pop de
  • 834. ned, it is easy to
  • 835. nd the top k elements from array. we can build a max-heap from the array, then perform pop operation k times. 1: function Top-k(A; k) 2: R 3: Build-Heap(A) 4: for i 1 to Min(k, Length(A)) do 5: Append(R, Pop(A)) 6: return R If k is greater than the length of the array, we need return the whole array as the result. That's why it calls the Min function to determine the number of loops. Below example Python program implements the top-k algorithm. def top_k(x, k, less_p = MIN_HEAP): build_heap(x, less_p) return [heap_pop(x, less_p) for _ in range(min(k, len(x)))] Decrease key Heap can be used to implement priority queue. It is important to support key modi
  • 836. cation in heap. One typical operation is to increase the priority of a tasks so that it can be performed earlier. Here we present the decrease key operation for a min-heap. The correspond-ing operation is increase key for max-heap. Figure 8.5 illustrate such a case for a max-heap. The key of the 9-th node is increased from 4 to 15. Once a key is decreased in a min-heap, it may make the node con ict with the heap property, that the key may be less than some ancestor. In order to maintain the invariant, the following auxiliary algorithm is de
  • 837. ned to resume the heap property. 1: function Heap-Fix(A; i) 2: while i 1 ^ A[i] A[ Parent(i) ] do 3: Exchange A[i] $ A[ Parent(i) ] 4: i Parent(i) This algorithm repeatedly compares the keys of parent node and current node. It swap the nodes if the parent contains the smaller key. This process is performed from the current node towards the root node till it
  • 838. nds that the parent node holds the smaller key. With this auxiliary algorithm, decrease key can be realized as below. 1: function Decrease-Key(A; i; k) 2: if k A[i] then 3: A[i] k 4: Heap-Fix(A; i)
  • 839. 204 CHAPTER 8. BINARY HEAPS 1 6 1 4 1 0 8 7 2 4 1 9 3 a. The 9-th node with key 4 will be modi
  • 840. ed; 1 6 1 4 1 0 8 7 2 1 5 1 9 3 b. The key is modi
  • 841. ed to 15, which is greater than its parent; 1 6 1 4 1 0 1 5 7 2 8 1 9 3 c. According the max-heap property, 8 and 15 are swapped. 1 6 1 5 1 0 1 4 7 2 8 1 9 3 d. Since 15 is greater than its parent 14, they are swapped. After that, because 15 is less than 16, the process terminates. Figure 8.5: Example process when increase a key in a max-heap.
  • 842. 8.2. IMPLICIT BINARY HEAP BY ARRAY 205 This algorithm is only triggered when the new key is less than the original key. The performance is bound to O(lg n). Below example ANSI C program implements the algorithm. void heap_fix(Key a, int i, Less lt) { while (i 0 lt(a[i], a[PARENT(i)])) { swap(a, i, PARENT(i)); i = PARENT(i); } } void decrease_key(Key a, int i, Key k, Less lt) { if (lt(k, a[i])) { a[i] = k; heap_fix(a, i, lt); } } Insertion In some materials like [2], insertion is implemented by using Decrease-Key. A new node with 1 as key is created. According to the min-heap property, it should be the last element in the under ground array. After that, the key is decreased to the value to be inserted, and Decrease-Key is called to
  • 843. x any violation to the heap property. Alternatively, we can reuse Heap-Fix to implement insertion. The new key is directly appended at the end of the array, and the Heap-Fix is applied to this new node. 1: function Heap-Push(A; k) 2: Append(A; k) 3: Heap-Fix(A; jAj) The following example Python program implements the heap insertion algo-rithm. def heap_insert(x, key, less_p = MIN_HEAP): i = len(x) x.append(key) heap_fix(x, i, less_p) 8.2.5 Heap sort Heap sort is interesting application of heap. According to the heap property, the min(max) element can be easily accessed by from the top of the heap. A straightforward way to sort a list of values is to build a heap from them, then continuously pop the smallest element till the heap is empty. The algorithm based on this idea can be de
  • 844. ned like below. 1: function Heap-Sort(A) 2: R 3: Build-Heap(A) 4: while A6= do 5: Append(R, Heap-Pop(A))
  • 845. 206 CHAPTER 8. BINARY HEAPS 6: return R The following Python example program implements this de
  • 846. nition. def heap_sort(x, less_p = MIN_HEAP): res = [] build_heap(x, less_p) while x!=[]: res.append(heap_pop(x, less_p)) return res When sort n elements, the Build-Heap is bound to O(n). Since pop is O(lg n), and it is called n times, so the overall sorting takes O(n lg n) time to run. Because we use another list to hold the result, the space requirement is O(n). Robert. W. Floyd found a fast implementation of heap sort. The idea is to build a max-heap instead of min-heap, so the
  • 847. rst element is the biggest one. Then the biggest element is swapped with the last element in the array, so that it is in the right position after sorting. As the last element becomes the new top, it may violate the heap property. We can shrink the heap size by one and perform Heapify to resume the heap property. This process is repeated till there is only one element left in the heap. 1: function Heap-Sort(A) 2: Build-Max-Heap(A) 3: while jAj 1 do 4: Exchange A[1] $ A[n] 5: jAj jAj 1 6: Heapify(A; 1) This is in-place algorithm, it needn't any extra spaces to hold the result. The following. The following ANSI C example code implements this algorithm. void heap_sort(Key a, int n) { build_heap(a, n, notless); while(n 1) { swap(a, 0, --n); heapify(a, 0, n, notless); } } Exercise 8.1 Somebody considers one alternative to realize in-place heap sort. Take sorting the array in ascending order as example, the
  • 848. rst step is to build the array as a minimum heap A, but not the maximum heap like the Floyd's method. After that the
  • 849. rst element a1 is in the correct place. Next, treat the rest fa2; a3; :::; ang as a new heap, and perform Heapify to them from a2 for these n 1 elements. Repeating this advance and Heapify step from left to right would sort the array. The following example ANSI C code illustrates this idea. Is this solution correct? If yes, prove it; if not, why? void heap_sort(Key a, int n) { build_heap(a, n, less); while(--n)
  • 850. 8.3. LEFTIST HEAP AND SKEW HEAP, THE EXPLICIT BINARY HEAPS207 heapify(++a, 0, n, less); } Because of the same reason, can we perform Heapify from left to right k times to realize in-place top-k algorithm like below ANSI C code? int tops(int k, Key a, int n, Less lt) { build_heap(a, n, lt); for (k = MIN(k, n) - 1; k; --k) heapify(++a, 0, --n, lt); return k; } 8.3 Leftist heap and Skew heap, the explicit bi- nary heaps Instead of using implicit binary tree by array, it is natural to consider why we can't use explicit binary tree to realize heap? There are some problems must be solved if we turn into explicit binary tree as the under ground data structure. The
The first problem is about the Heap-Pop (or Delete-Min) operation. Consider a binary tree represented by its left child, key, and right child as (L, k, R), which is shown in figure 8.6.

Figure 8.6: A binary tree with root k and children L and R; all elements in the children are no less than k.

If k is the top element, all elements in the left and right children are no less than k. After k is popped, only the left and right children remain, and they have to be merged into a new tree. Since the heap property should be maintained after the merge, the new root must again be the smallest element.

Because both the left and right children are binary trees conforming to the heap property, two trivial cases can be defined immediately:

merge(H1, H2) = H2 : H1 = ∅
                H1 : H2 = ∅
                ?  : otherwise

Where ∅ means the empty heap.

If neither the left nor the right child is empty, then, because each of them fits the heap property, their top elements are their respective minima. We can compare the two roots and select the smaller one as the new root of the merged heap. For instance, let L = (A, x, B) and R = (A', y, B'), where A, A', B, and B' are all sub-trees. If x < y, x will be the new root. We can either keep A and recursively merge B and R, or keep B and merge A and R, so the new heap can be one of the following:

(merge(A, R), x, B)
(A, x, merge(B, R))

Both are correct. One simplified solution is to always merge into the right sub-tree. The Leftist tree provides a systematic approach based on this idea.
8.3.1 Definition

The heap implemented by a Leftist tree is called a Leftist heap. The Leftist tree was first introduced by C. A. Crane in 1972 [6].

Rank (S-value)

In a Leftist tree, a rank value (or S-value) is defined for each node. The rank is the distance to the nearest external node, where an external node is the NIL concept extended from a leaf. For example, in figure 8.7 the rank of NIL is defined as 0. Consider the root node 4: its nearest external node is a child of node 8, so the rank of the root is 2. Because node 6 and node 8 contain only NIL children, their rank values are 1. Although node 5 has a non-NIL left child, its right child is NIL, so its rank, the minimum distance to an external node, is still 1.

Figure 8.7: rank(4) = 2, rank(6) = rank(8) = rank(5) = 1.
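This definition is easy to check mechanically. The following Python sketch computes the rank of any node directly from the definition; the node object with left and right fields (None standing for NIL) is a hypothetical stand-in used only for this illustration.

def rank(node):
    # rank (S-value): the distance to the nearest external (NIL) node
    if node is None:
        return 0
    return 1 + min(rank(node.left), rank(node.right))

For the tree in figure 8.7 this yields 2 for the root and 1 for nodes 5, 6, and 8. In a real Leftist tree, the rank is stored in each node rather than recomputed, as we will see below.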
Leftist property

With rank defined, we can create a merging strategy. When merging, we always merge into the right child. Denote the rank of the new right sub-tree as r_r and the rank of the left sub-tree as r_l: if r_l < r_r, we swap the left and the right children. We call this the `Leftist property'. In general, a Leftist tree always has the shortest path to some external node on the right.

A Leftist tree tends to be very unbalanced; however, it guarantees the important property specified in the following theorem.

Theorem 8.3.1. If a Leftist tree T contains n internal nodes, the path from the root to the rightmost external node contains at most ⌊log(n + 1)⌋ nodes.

We skip the proof here; readers can refer to [7] and [1] for more information. With this theorem, algorithms that operate along this path are all bound to O(lg n).

We can reuse the binary tree definition, augmented with a rank field, to define the Leftist tree, for example in the form (r, k, L, R) for the non-empty case. The Haskell code below defines the Leftist tree.
data LHeap a = E -- Empty
             | Node Int a (LHeap a) (LHeap a) -- rank, element, left, right

For the empty tree, the rank is defined as zero; otherwise it's the value of the augmented field. A rank(H) function can be given to cover both cases.

rank(H) = 0 : H = ∅
          r : otherwise, H = (r, k, L, R)        (8.1)

Here is the example Haskell rank function.

rank E = 0
rank (Node r _ _ _) = r

In the rest of this section, we denote rank(H) as r_H.

8.3.2 Merge

In order to realize `merge', we need an auxiliary algorithm that compares the ranks and swaps the children if necessary.

mk(k, A, B) = (r_A + 1, k, B, A) : r_A < r_B
              (r_B + 1, k, A, B) : otherwise        (8.2)

This function takes three arguments: a key and two sub-trees A and B. If the rank of A is smaller, it builds a bigger tree with B as the left child and A as the right child, and takes r_A + 1 as the rank of the new tree. Otherwise, if B holds the smaller rank, A is set as the left child, B becomes the right, and the resulting rank is r_B + 1. The rank needs to be increased by one because a new key is put on top of the tree, which makes the path to the nearest external node one step longer.

Denote the key, left child, and right child of H1 and H2 as k1, L1, R1 and k2, L2, R2 respectively. The merge(H1, H2) function can be completed with this auxiliary tool as below.

merge(H1, H2) = H2 : H1 = ∅
                H1 : H2 = ∅
                mk(k1, L1, merge(R1, H2)) : k1 < k2
                mk(k2, L2, merge(H1, R2)) : otherwise        (8.3)

The merge function is always called recursively on the right side, and the Leftist property is maintained; these facts ensure the performance is bound to O(lg n). The following Haskell example code implements the merge program.

merge E h = h
merge h E = h
merge h1@(Node _ x l r) h2@(Node _ y l' r') =
    if x < y then makeNode x l (merge r h2)
    else makeNode y l' (merge h1 r')

makeNode x a b = if rank a < rank b then Node (rank a + 1) x b a
                 else Node (rank b + 1) x a b

Merge operation in the implicit binary heap by array

The implicit binary heap by array performs very fast in most cases, and it fits modern computers with cache technology well. However, its merge is bound to O(n) time. The typical realization is to concatenate the two arrays together and build a heap over the result [13].

1: function Merge-Heap(A, B)
2:     C ← Concat(A, B)
3:     Build-Heap(C)
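A direct Python transcription of Merge-Heap may read as below; it reuses the build_heap helper and a less_p predicate from the earlier array-heap examples, whose availability here is an assumption.

def merge_heap(a, b, less_p):
    c = a + b              # concatenation is O(n)
    build_heap(c, less_p)  # Build-Heap is O(n) as well
    return c

Both steps are linear, so the whole merge stays O(n), in contrast with the O(lg n) merge of the explicit heaps developed in this section.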
8.3.3 Basic heap operations

Most of the basic heap operations can be implemented with the merge algorithm defined above.

Top and pop

Because the smallest element is always held at the root, finding the minimum value is a trivial, constant O(1) operation. The equation below extracts the root from a non-empty heap H = (r, k, L, R); error handling for the empty case is skipped here.

top(H) = k        (8.4)

For the pop operation, the top element is removed first, then the left and right children are merged into a new heap.

pop(H) = merge(L, R)        (8.5)

Because it calls merge directly, the pop operation on a Leftist heap is bound to O(lg n).

Insertion

To insert a new element, one solution is to create a single leaf node holding the element, then merge this leaf into the existing Leftist tree.

insert(H, k) = merge(H, (1, k, ∅, ∅))        (8.6)

This is an O(lg n) algorithm, since insertion also calls merge directly.

There is a convenient way to build a Leftist heap from a list: continuously insert the elements one by one into the empty heap. This can be realized by folding.

build(L) = fold(insert, ∅, L)        (8.7)

Figure 8.8 shows one example Leftist tree built in this way.

Figure 8.8: A Leftist tree built from the list {9, 4, 16, 7, 10, 2, 14, 3, 8, 1}.

The following example Haskell code gives a reference implementation of the Leftist tree operations.

insert h x = merge (Node 1 x E E) h

findMin (Node _ x _ _) = x

deleteMin (Node _ _ l r) = merge l r

fromList = foldl insert E
8.3.4 Heap sort by Leftist heap

With all the basic operations defined, it's straightforward to implement heap sort: first turn the list into a Leftist heap, then continuously extract the minimum element from it.

sort(L) = heapSort(build(L))        (8.8)

heapSort(H) = ∅ : H = ∅
              {top(H)} ∪ heapSort(pop(H)) : otherwise        (8.9)

Because pop is a logarithmic operation and it is recursively called n times, this algorithm takes O(n lg n) time in total. The following Haskell example program implements heap sort with the Leftist tree.

heapSort = hsort . fromList where
    hsort E = []
    hsort h = (findMin h) : (hsort $ deleteMin h)

8.3.5 Skew heaps

The Leftist heap performs poorly in some cases. Figure 8.9 shows one example: the Leftist tree built by folding over the list {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}.

Figure 8.9: A very unbalanced Leftist tree built from the list {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}.

The binary tree almost degrades into a linked list. The worst case happens when an ordered list is fed in to build the Leftist tree: because the tree degrades into a linked list, the performance drops from O(lg n) to O(n).

The Skew heap (or self-adjusting heap) simplifies the Leftist heap realization and solves this performance issue [9] [10].
When constructing the Leftist heap, we swap the left and right children during the merge if the rank on the left side is less than on the right side. This compare-and-swap strategy doesn't work when a sub-tree has only one child, because in that case the rank of the sub-tree is always 1 no matter how big the sub-tree is. A `brute-force' approach is to swap the left and right children every time we merge. This idea leads to the Skew heap.

Definition of Skew heap

A Skew heap is the heap realized with a Skew tree. A Skew tree is a special binary tree: the minimum element is stored at the root, and every sub-tree is also a Skew tree. It needn't keep the rank (or S-value) field, so we can reuse the plain binary tree definition: the tree is either empty or has the pre-order form (k, L, R). The Haskell code below defines the Skew heap like this.

data SHeap a = E -- Empty
             | Node a (SHeap a) (SHeap a) -- element, left, right

Merge

The merge algorithm turns out to be very simple. When merging two non-empty Skew trees, we compare the roots and pick the smaller one as the new root; the tree containing the bigger element is merged into one of the sub-trees; finally, the two children are swapped. Denote H1 = (k1, L1, R1) and H2 = (k2, L2, R2) when they are not empty. If k1 < k2, for instance, select k1 as the new root. We can merge H2 either into L1 or into R1; without loss of generality, merge it into R1. After swapping the two children, the final result is (k1, merge(R1, H2), L1). Taking the edge cases into account, the merge algorithm is defined as the following.

merge(H1, H2) = H1 : H2 = ∅
                H2 : H1 = ∅
                (k1, merge(R1, H2), L1) : k1 < k2
                (k2, merge(H1, R2), L2) : otherwise        (8.10)

All the other operations, including insert, top, and pop, are realized in terms of merge exactly as in the Leftist heap, except that we needn't maintain the rank any more. Translating the above algorithm into Haskell yields the following example program.

merge E h = h
merge h E = h
merge h1@(Node x l r) h2@(Node y l' r') =
    if x < y then Node x (merge r h2) l
    else Node y (merge h1 r') l'

insert h x = merge (Node x E E) h

findMin (Node x _ _) = x

deleteMin (Node _ l r) = merge l r
Different from the Leftist heap, if we feed an ordered list to a Skew heap, it builds a fairly balanced binary tree, as illustrated in figure 8.10.

Figure 8.10: A Skew tree is still balanced even when the input is the ordered list {1, 2, ..., 10}.

8.4 Splay heap

The Leftist heap and the Skew heap show that it's quite possible to realize the heap data structure with an explicit binary tree. The Skew heap gives one method to solve the tree balance problem; the Splay heap uses another method to keep the tree balanced.

The binary trees used in the Leftist heap and the Skew heap are not binary search trees (BST). If we turn the underlying data structure into a binary search tree, the minimum (or maximum) element is no longer at the root, and it takes O(lg n) time to find it.

A binary search tree becomes inefficient if it isn't well balanced: most operations degrade to O(n) in the worst case. Although a red-black tree can be used to realize a binary heap, it's overkill. The Splay tree provides a lightweight implementation with an acceptable dynamic balancing result.

8.4.1 Definition

The Splay tree uses a cache-like approach: it keeps rotating the currently accessed node towards the top, so that the node can be accessed quickly next time. It defines this kind of operation as splaying. For an unbalanced binary search tree, after several splay operations the tree tends to become more and more balanced. Most basic operations of the Splay tree run in amortized O(lg n) time.
The Splay tree was invented by Daniel Dominic Sleator and Robert Endre Tarjan in 1985 [11] [12].

Splaying

There are two methods of splaying. The first needs to deal with many different cases, but can be implemented fairly easily with pattern matching; the second has a uniform form, but its implementation is more complex. Denote the node currently being accessed as X, its parent node as P, and its grandparent node as G (if it exists). There are three splaying steps, and each step has two symmetric cases; for illustration purposes, only one case of each step is shown.

Zig-zig step. As shown in figure 8.11, X and P are children on the same side, either both on the left or both on the right. By rotating twice, X becomes the new root.

Figure 8.11: Zig-zig case. (a) X and P are either both left or both right children. (b) X becomes the new root after rotating twice.

Zig-zag step. As shown in figure 8.12, X and P are children on different sides: X is on the left and P is on the right, or vice versa. After rotation, X becomes the new root, and P and G become siblings.

Figure 8.12: Zig-zag case. (a) X and P are children on different sides. (b) X becomes the new root; P and G are siblings.

Zig step. As shown in figure 8.13, P is the root; we rotate the tree so that X becomes the new root. This is the last step of a splay operation.

Figure 8.13: Zig case. (a) P is the root. (b) Rotate the tree to make X the new root.

Although there are six different cases in total, they can be handled concisely in environments that support pattern matching. Denote the binary tree in the form (L, k, R).
When accessing a key Y in tree T, the splay operation can be defined as below.

splay(T, Y) = (a, X, (b, P, (c, G, d)))   : T = (((a, X, b), P, c), G, d), X = Y
              (((a, G, b), P, c), X, d)   : T = (a, G, (b, P, (c, X, d))), X = Y
              ((a, P, b), X, (c, G, d))   : T = ((a, P, (b, X, c)), G, d), X = Y
              ((a, G, b), X, (c, P, d))   : T = (a, G, ((b, X, c), P, d)), X = Y
              (a, X, (b, P, c))           : T = ((a, X, b), P, c), X = Y
              ((a, P, b), X, c)           : T = (a, P, (b, X, c)), X = Y
              T                           : otherwise        (8.11)

The first two clauses handle the zig-zig cases; the next two clauses handle the zig-zag cases; the last two clauses handle the zig cases. The tree is left unchanged in all other situations. The following Haskell program implements this splay function.

data STree a = E -- Empty
             | Node (STree a) a (STree a) -- left, key, right

-- zig-zig
splay t@(Node (Node (Node a x b) p c) g d) y =
    if x == y then Node a x (Node b p (Node c g d)) else t
splay t@(Node a g (Node b p (Node c x d))) y =
    if x == y then Node (Node (Node a g b) p c) x d else t
-- zig-zag
splay t@(Node (Node a p (Node b x c)) g d) y =
    if x == y then Node (Node a p b) x (Node c g d) else t
splay t@(Node a g (Node (Node b x c) p d)) y =
    if x == y then Node (Node a g b) x (Node c p d) else t
-- zig
splay t@(Node (Node a x b) p c) y = if x == y then Node a x (Node b p c) else t
splay t@(Node a p (Node b x c)) y = if x == y then Node (Node a p b) x c else t
-- otherwise
splay t _ = t

With the splay operation defined, every time we insert a new key, we call the splay function to adjust the tree. If the tree is empty, the result is a leaf; otherwise we compare the key with the root: if it is less than the root, we recursively insert it into the left child and perform splaying after that; otherwise we insert it into the right child and splay.

insert(T, x) = (∅, x, ∅) : T = ∅
               splay((insert(L, x), k, R), x) : T = (L, k, R), x < k
               splay((L, k, insert(R, x)), x) : otherwise        (8.12)

The following Haskell program implements this insertion algorithm.

insert E y = Node E y E
insert (Node l x r) y
    | x > y     = splay (Node (insert l y) x r) y
    | otherwise = splay (Node l x (insert r y)) y

Figure 8.14 shows the result of using this function to insert the ordered elements {1, 2, ..., 10} one by one into an empty tree. A normal binary search tree would degrade into a linked list here; the splay method creates a much more balanced result.

Figure 8.14: Splaying helps improve the balance.

Okasaki found a simple rule for splaying [4]: whenever we follow two left branches or two right branches in a row, we rotate the two nodes. Based on this rule, splaying can be realized as follows. When we access a node for a key x (during inserting, looking up, or deleting a node), if we traverse two left branches or two right branches, we partition the tree into two parts L and R, where L contains all nodes smaller than x and R contains the rest. We can then create a new tree (for instance during insertion) with x as the root, L as the left child, and R as the right child.
The partition process is recursive, because it splays its children as well.

partition(T, p) =
    (∅, ∅)                   : T = ∅
    (T, ∅)                   : T = (L, k, R), k < p, R = ∅
    (((L, k, L'), k', A), B) : T = (L, k, (L', k', R')), k < p, k' < p, (A, B) = partition(R', p)
    ((L, k, A), (B, k', R')) : T = (L, k, (L', k', R')), k < p ≤ k', (A, B) = partition(L', p)
    (∅, T)                   : T = (L, k, R), p ≤ k, L = ∅
    (A, (B, k', (R', k, R))) : T = ((L', k', R'), k, R), p ≤ k, p < k', (A, B) = partition(L', p)
    ((L', k', A), (B, k, R)) : T = ((L', k', R'), k, R), k' ≤ p ≤ k, (A, B) = partition(R', p)
        (8.13)

Function partition(T, p) takes a tree T and a pivot p as arguments. The first clause is the edge case: the partition result for the empty tree is a pair of empty left and right trees. Otherwise, denote the tree as (L, k, R); we compare the pivot p with the root k. If k < p, there are two sub-cases. One is the trivial case where R is empty: by the binary search tree property, all elements are less than p, so the result pair is (T, ∅). In the other case, R = (L', k', R'), and we further compare k' with the pivot. If k' < p also holds, we recursively partition R' with the pivot; all elements of R' less than p are held in tree A, and the rest in tree B. The result pair is then composed of two trees: one is ((L, k, L'), k', A), the other is B. If instead the key of the right sub-tree is not less than the pivot, we recursively partition L' with the pivot to get the intermediate pair (A, B); the final pair of trees is composed of (L, k, A) and (B, k', R'). The cases for p ≤ k are symmetric and are handled in the last three clauses.

Translating the above algorithm into Haskell yields the following partition program.

partition E _ = (E, E)
partition t@(Node l x r) y
    | x < y = case r of
        E -> (t, E)
        Node l' x' r' ->
            if x' < y then
                let (small, big) = partition r' y
                in (Node (Node l x l') x' small, big)
            else
                let (small, big) = partition l' y
                in (Node l x small, Node big x' r')
    | otherwise = case l of
        E -> (E, t)
        Node l' x' r' ->
            if y < x' then
                let (small, big) = partition l' y
                in (small, Node big x' (Node r' x r))
            else
                let (small, big) = partition r' y
                in (Node l' x' small, Node big x r)

Alternatively, insertion can be realized with the partition algorithm. When inserting a new element k into the splay heap T, we first partition the heap into two trees L and R, where L contains all nodes smaller than k and R contains the rest. We then construct a new node with k as the root and L, R as the children.

insert(T, k) = (L, k, R), where (L, R) = partition(T, k)        (8.14)

The corresponding Haskell example program is as the following.

insert t x = Node small x big where (small, big) = partition t x

Top and pop

Since a splay tree is just a special binary search tree, the minimum element is stored in the leftmost node; we keep traversing the left child to realize the top operation. Denote the non-empty tree as T = (L, k, R); the top(T) function can be defined as below.

top(T) = k : L = ∅
         top(L) : otherwise        (8.15)

This is exactly the min(T) algorithm for the binary search tree.

For the pop operation, the algorithm needs to remove the minimum element from the tree. Whenever two left nodes are traversed in a row, a splaying operation is performed.

pop(T) = R : T = (∅, k, R)
         (R', k, R) : T = ((∅, k', R'), k, R)
         (pop(L'), k', (R', k, R)) : T = ((L', k', R'), k, R)        (8.16)

Note that the third clause performs splaying without explicitly calling the partition function; it utilizes the binary search tree property directly. Because the splay tree is balanced, both the top and the pop algorithms are bound to O(lg n) time.

The following Haskell example programs implement the top and pop operations.

findMin (Node E x _) = x
findMin (Node l x _) = findMin l

deleteMin (Node E x r) = r
deleteMin (Node (Node E x' r') x r) = Node r' x r
deleteMin (Node (Node l' x' r') x r) = Node (deleteMin l') x' (Node r' x r)
Merge

Merge is another basic heap operation; it is widely used, for instance, in graph algorithms. By using the partition algorithm, merge can be realized in O(lg n) time. When merging two splay trees, for the non-trivial case, we take the root of the first tree as the new root, then partition the second tree with this new root as the pivot. After that we recursively merge the children of the first tree with the two partition results. This algorithm is defined as the following.

merge(T1, T2) = T2 : T1 = ∅
                (merge(L, A), k, merge(R, B)) : T1 = (L, k, R), (A, B) = partition(T2, k)        (8.17)

If the first heap is empty, the result is definitely the second heap. Otherwise, denote the first splay heap as (L, k, R); we partition T2 with k as the pivot to yield (A, B), where A contains all the elements of T2 less than k and B holds the rest. We then recursively merge A with L, and B with R, to form the new children of T1.

Translating the definition to Haskell gives the following example program.

merge E t = t
merge (Node l x r) t = Node (merge l l') x (merge r r')
    where (l', r') = partition t x

8.4.2 Heap sort

Since the internal implementation of the Splay heap is completely hidden behind the heap interface, the heap sort algorithm can be reused directly. In other words, the heap sort algorithm is generic, no matter what the underlying data structure is.
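To make this genericity concrete, here is a minimal Python sketch of such a heap-interface-driven sort. The names build, is_empty, top, and pop are hypothetical placeholders for whichever heap implementation (Leftist, Skew, or Splay) is plugged in; none of them is code defined in this book.

def heap_sort_with(heap, xs):
    # heap is any object or module exposing build, is_empty, top, and pop
    h = heap.build(xs)
    res = []
    while not heap.is_empty(h):
        res.append(heap.top(h))
        h = heap.pop(h)
    return res

Swapping one heap implementation for another changes the constant factors, but not this sorting routine.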
8.5 Notes and short summary

In this chapter, we define the binary heap more generally: as long as the heap property is maintained, any binary representation of data can be used to implement the binary heap. This definition isn't limited to the popular array-based binary heap; it also extends to the explicit binary heaps, including the Leftist heap, the Skew heap, and the Splay heap. The array-based binary heap is particularly convenient for imperative implementation, because it relies intensively on random index access, which maps the heap to a complete binary tree. It's hard to find a direct functional counterpart in this way. However, by using an explicit binary tree, functional implementations can be achieved; most of them have O(lg n) worst-case performance, and some even reach O(1) amortized time. Okasaki gives a detailed analysis of these data structures in [4].

In this chapter, only the purely functional realizations of the Leftist heap, Skew heap, and Splay heap are explained; they can all be realized in imperative approaches as well.

It's very natural to extend the concept from the binary tree to the k-ary (k-way) tree, which leads to other useful heaps such as the Binomial heap, the Fibonacci heap, and the Pairing heap. They are introduced in the next chapter.

Exercise 8.2

Realize the imperative Leftist heap, Skew heap, and Splay heap.
Bibliography

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.
[2] Heap (data structure), Wikipedia. http://en.wikipedia.org/wiki/Heap_(data_structure)
[3] Heapsort, Wikipedia. http://en.wikipedia.org/wiki/Heapsort
[4] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, July 1, 1999. ISBN-13: 978-0521663502.
[5] Sorting algorithms/Heapsort, Rosetta Code. http://rosettacode.org/wiki/Sorting_algorithms/Heapsort
[6] Leftist Tree, Wikipedia. http://en.wikipedia.org/wiki/Leftist_tree
[7] Bruno R. Preiss. Data Structures and Algorithms with Object-Oriented Design Patterns in Java. http://www.brpreiss.com/books/opus5/index.html
[8] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-Wesley Professional; 2nd Edition, October 15, 1998. ISBN-13: 978-0201485417. Sections 5.2.3 and 6.2.3.
[9] Skew heap, Wikipedia. http://en.wikipedia.org/wiki/Skew_heap
[10] Sleator, Daniel Dominic; Tarjan, Robert Endre. "Self-adjusting heaps". SIAM Journal on Computing 15(1):52-69, 1986. doi:10.1137/0215004. ISSN 0097-5397.
[11] Splay tree, Wikipedia. http://en.wikipedia.org/wiki/Splay_tree
[12] Sleator, Daniel D.; Tarjan, Robert E. "Self-Adjusting Binary Search Trees". Journal of the ACM 32(3):652-686, 1985. doi:10.1145/3828.3835.
[13] NIST, binary heap. http://xw2k.nist.gov/dads//HTML/binaryheap.html
Chapter 9

From grape to the world cup, the evolution of selection sort

9.1 Introduction

We have introduced the `hello world' sorting algorithm, insertion sort. In this short chapter, we explain another straightforward sorting method, selection sort. The basic version of selection sort doesn't perform as well as the divide-and-conquer methods, e.g. quick sort and merge sort. We'll use the same approach as in the chapter about insertion sort: analyze why it's slow, and improve it by a variety of attempts until we reach the best bound of comparison-based sorting, O(N lg N), by evolving it into heap sort.

The idea of selection sort can be illustrated by a real-life story. Consider a kid eating a bunch of grapes. There are two types of children according to my observation: one is the optimistic type, where the kid always eats the biggest grape he or she can find; the other is pessimistic, always eating the smallest one.

The first type of kid eats the grapes in monotonically decreasing order of size, while the other eats in increasing order. In fact, the kid sorts the grapes by size, and the method used here is selection sort.

Based on this idea, the algorithm of selection sort can be described directly as follows. In order to sort a series of elements:

- In the trivial case, if the series is empty, we are done and the result is also empty;
- Otherwise, we find the smallest element and append it to the tail of the result.

Note that this algorithm sorts the elements in increasing order; it's easy to sort in decreasing order by picking the biggest element instead. We'll introduce passing a comparator as a parameter later on.
Figure 9.1: Always picking the smallest grape.

This description can be formalized into an equation.

sort(A) = ∅ : A = ∅
          {m} ∪ sort(A') : otherwise        (9.1)

Where m is the minimum element of the collection A, and A' consists of all the rest of the elements except m:

m = min(A)
A' = A − {m}

We don't limit the data structure of the collection here. Typically, A is an array in an imperative environment and a list (particularly a singly linked list) in a functional environment; it can even be another data structure, to be introduced later.

The algorithm can also be given in an imperative manner.

function Sort(A)
    X ← ∅
    while A ≠ ∅ do
        x ← Min(A)
        A ← Del(A, x)
        X ← Append(X, x)
    return X

Figure 9.2 depicts the process of this algorithm.

Figure 9.2: The left part is sorted data; continuously pick the minimum element from the rest and append it to the result.
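Equation (9.1) can be translated into Python almost word for word. The sketch below is for illustration only; like the original idea, it ignores the cost of copying and removing.

def sel_sort(A):
    # direct translation of equation (9.1)
    if A == []:
        return []
    m = min(A)
    rest = A[:]       # copy, so the input list is left untouched
    rest.remove(m)    # A' = A - {m}
    return [m] + sel_sort(rest)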
We just translated the very original idea of `eating grapes' line by line, without considering any expense of time and space. This realization stores the result in X, and when a selected element is appended to X, we delete the same element from A. This indicates that we can change it into an `in-place' sort and reuse the space of A.

The idea is to store the minimum element in the first cell of A (we use the term `cell' if A is an array, and say `node' if A is a list); then store the second minimum element in the next cell, then the third, and so on. One solution to realize this sorting strategy is swapping: when we select the i-th minimum element, we swap it with the element in the i-th cell.

function Sort(A)
    for i ← 1 to |A| do
        m ← Min(A[i...])
        Exchange A[i] ↔ m

Denote A = {a1, a2, ..., aN}. At any time when we process the i-th element, all elements before i, that is {a1, a2, ..., a(i-1)}, have already been sorted. We locate the minimum element among {ai, a(i+1), ..., aN} and exchange it with ai, so that the i-th cell contains the right value. The process repeats until we arrive at the last element. This idea is illustrated in figure 9.3.

Figure 9.3: The left part is sorted data; continuously pick the minimum element from the rest and put it into the right position.

9.2 Finding the minimum

We haven't completely realized selection sort yet, because we have treated the operation of finding the minimum (or maximum) element as a black box. How a kid locates the biggest or the smallest grape is a puzzle, and an interesting topic for computer algorithms.

The easiest, but not the fastest, way to find the minimum in a collection is to perform a scan. There are several ways to interpret this scan process. Consider picking the biggest grape: we start from any grape, compare it with another one, and keep the bigger; then we take the next grape, compare it with the one selected so far, keep the bigger, and continue this take-and-compare process until there are no grapes left to compare.

It's easy to get lost in real practice if we don't mark which grapes have been compared. There are two ways to solve this problem, suitable for different data structures respectively.
9.2.1 Labeling

Method 1 is to label each grape with a number {1, 2, ..., N} and perform the comparisons systematically in the order of this sequence of labels: first compare grape number 1 and grape number 2 and pick the bigger one; then take grape number 3 and do the comparison, ... We repeat this process until we arrive at grape number N. This is quite suitable for elements stored in an array.

function Min(A)
    m ← A[1]
    for i ← 2 to |A| do
        if A[i] < m then
            m ← A[i]
    return m

With Min defined, we can complete the basic version of selection sort (the naive version without any optimization in terms of time and space). However, this algorithm returns the value of the minimum element instead of its location (the label of the grape), which needs a bit of tweaking for the in-place version. Some languages, such as ISO C++, support returning a reference as the result, so that the swap can be achieved directly, as below.

template<typename T>
T& min(T* from, T* to) {
    T* m;
    for (m = from++; from != to; ++from)
        if (*from < *m)
            m = from;
    return *m;
}

template<typename T>
void ssort(T* xs, int n) {
    int i;
    for (i = 0; i < n; ++i)
        std::swap(xs[i], min(xs + i, xs + n));
}

In environments without reference semantics, the solution is to return the location of the minimum element instead of its value:

function Min-At(A)
    m ← First-Index(A)
    for i ← m + 1 to |A| do
        if A[i] < A[m] then
            m ← i
    return m

Note that since we pass A[i...] to Min-At as the argument, we assume the first element A[i] to be the smallest so far and examine the elements A[i+1], A[i+2], ... one by one. Function First-Index() retrieves i from the input parameter.

The following Python example program completes the basic in-place selection sort algorithm based on this idea. It explicitly passes the range information to the function that finds the minimum location.
def ssort(xs):
    n = len(xs)
    for i in range(n):
        m = min_at(xs, i, n)
        (xs[i], xs[m]) = (xs[m], xs[i])
    return xs

def min_at(xs, i, n):
    m = i
    for j in range(i + 1, n):
        if xs[j] < xs[m]:
            m = j
    return m

9.2.2 Grouping

Another method is to divide the grapes into two groups: the ones we have examined and the ones we haven't. Denote these two groups as A and B, and all the elements (grapes) as L. At the beginning, we haven't examined any grape at all, so A is empty (∅) and B contains all grapes. We select two arbitrary grapes from B, compare them, and put the loser (the smaller one, for example) into A. Then we repeat this process: continuously pick an arbitrary grape from B and compare it with the winner of the previous round, until B becomes empty. At that point, the final winner is the minimum element, and A turns out to be L − {min(L)}, which can be used for the next round of minimum finding.

There is an invariant of this method: at any time, L = A ∪ {m} ∪ B, where m is the winner held so far.

This approach doesn't need the collection of grapes to be indexed (as they are labeled in method 1). It's suitable for any traversable data structure, including linked lists. Suppose b1 is an arbitrary element of B when B isn't empty, and B' is the rest of the elements with b1 removed; this method can be formalized with the auxiliary function below.

min'(A, m, B) = (m, A) : B = ∅
                min'(A ∪ {m}, b1, B') : b1 < m
                min'(A ∪ {b1}, m, B') : otherwise        (9.2)

In order to pick the minimum element, we call this auxiliary function with an empty A, using an arbitrary element (for instance, the first one) to initialize m:

extractMin(L) = min'(∅, l1, L')        (9.3)

Where L' contains all elements of L except the first one, l1. The extractMin algorithm not only finds the minimum element, but also returns the updated collection with that minimum removed. Plugging this minimum-extracting algorithm into the basic selection sort definition, we can create a complete functional sorting program, for example as in this Haskell code snippet.

sort [] = []
sort xs = x : sort xs' where (x, xs') = extractMin xs
extractMin (x:xs) = min' [] x xs where
    min' ys m [] = (m, ys)
    min' ys m (x:xs) = if m < x then min' (x:ys) m xs
                       else min' (m:ys) x xs

The first line handles the trivial edge case: the sorting result for an empty list is obviously empty. The second clause guarantees there is at least one element, which is why the extractMin function needs no further pattern matching.

One may think the second clause of the min' function should be written like below, or it would produce the updated list in reverse order:

min' ys m (x:xs) = if m < x then min' (ys ++ [x]) m xs
                   else min' (ys ++ [m]) x xs

Actually, it's necessary to use `cons' instead of appending here: appending is a linear operation, proportional to the length of part A, while `cons' is a constant O(1) operation. In fact, we needn't keep the relative order of the list to be sorted, since it will be rearranged during sorting anyway.

It is still possible to keep the relative order during sorting while ensuring that the performance of finding the minimum doesn't degrade to quadratic. The following equation defines such a solution.

extractMin(L) = (l1, ∅) : |L| = 1
                (l1, L') : l1 < m, (m, L'') = extractMin(L')
                (m, {l1} ∪ L'') : otherwise        (9.4)

If L is a singleton, the minimum is the only element it contains. Otherwise, denote the first element of L as l1, and let L' = {l2, l3, ...} contain the remaining elements. The algorithm recursively finds the minimum in L', which yields the intermediate result (m, L''): m is the minimum of L', and L'' holds all the rest except m. Comparing l1 with m determines the final minimum. The following Haskell program implements this version of selection sort.

sort [] = []
sort xs = x : sort xs' where (x, xs') = extractMin xs

extractMin [x] = (x, [])
extractMin (x:xs) = if x < m then (x, xs) else (m, x:xs')
    where (m, xs') = extractMin xs

Note that only the `cons' operation is used; we needn't append at all, because the algorithm actually examines the list from right to left. However, it's not free: the program has to book-keep the context (typically via the call stack). The relative order is ensured by the nature of the recursion. Please refer to the appendix about tail recursion for a detailed discussion.

9.2.3 Performance of the basic selection sort
Both the labeling method and the grouping method have to examine all the elements to pick the minimum in every round, and we pick the minimum N times in total. Thus the performance is around N + (N − 1) + (N − 2) + ... + 1, which is N(N + 1)/2. Selection sort is a quadratic algorithm, bound to O(N^2) time. Unlike insertion sort, which we introduced previously, selection sort performs the same in its best, worst, and average cases, whereas insertion sort performs well in its best case (when the list is reverse ordered and stored in a linked list) at O(N), with worst-case performance O(N^2).

In the next sections, we'll examine why selection sort performs poorly and try to improve it step by step.

Exercise 9.1

Implement the basic imperative selection sort algorithm (the non-in-place version) in your favorite programming language. Compare it with the in-place version, and analyze the time and space effectiveness.

9.3 Minor improvement

9.3.1 Parameterize the comparator

Before any improvement in terms of performance, let's make the selection sort algorithm general enough to handle different sorting criteria. We've seen two opposite examples so far: one may need to sort elements either in ascending or in descending order. For the former, we repeatedly find the minimum; for the latter, the maximum. These are just two special cases; in real-world practice, one may want to sort things by various criteria, e.g. by size, weight, age, ... One solution that handles them all is to pass the criterion as a compare function to the basic selection sort algorithm. For example:

sort(c, L) = ∅ : L = ∅
             {m} ∪ sort(c, L'') : otherwise, (m, L'') = extract(c, L)        (9.5)

And the algorithm extract(c, L) is defined as below.

extract(c, L) = (l1, ∅) : |L| = 1
                (l1, L') : c(l1, m), (m, L'') = extract(c, L')
                (m, {l1} ∪ L'') : ¬c(l1, m)        (9.6)

Where c is a comparator function: it takes two elements and tells whether one precedes the other. Passing the `less than' operator (<) turns this algorithm into the version introduced in the previous section.

Some environments require a total-ordering comparator to be passed, which distinguishes between `less than', `equal', and `greater than'. We needn't such a strong condition here: c only tests whether `less than' is satisfied. As a minimum requirement, however, the comparator should satisfy the following strict weak ordering properties [16]:

- Irreflexivity: for all x, it's not the case that x < x;
- Asymmetry: for all x and y, if x < y, then it's not the case that y < x;
- Transitivity: for all x, y, and z, if x < y and y < z, then x < z.

The following Scheme/Lisp program translates this generic selection sort. We choose Scheme/Lisp here because lexical scope spares us from passing the `less than' comparator in every function call.

(define (sel-sort-by ltp? lst)
  (define (ssort lst)
    (if (null? lst)
        lst
        (let ((p (extract-min lst)))
          (cons (car p) (ssort (cdr p))))))
  (define (extract-min lst)
    (if (null? (cdr lst))
        lst
        (let ((p (extract-min (cdr lst))))
          (if (ltp? (car lst) (car p))
              lst
              (cons (car p) (cons (car lst) (cdr p)))))))
  (ssort lst))

Note that both ssort and extract-min are inner functions, so the `less than' comparator ltp? is available to them. Passing `<' to this function yields normal sorting in ascending order:

(sel-sort-by < '(3 1 2 4 5 10 9))
;Value 16: (1 2 3 4 5 9 10)

It's possible to pass various comparators to the imperative selection sort as well; this is left as an exercise to the reader.

For the sake of brevity, we only consider sorting elements in ascending order in the rest of this chapter, and we won't pass a comparator as a parameter unless necessary.
9.3.2 Trivial fine tune

The basic in-place imperative selection sort algorithm iterates over all elements and picks the minimum by traversing as well. It can be written compactly by inlining the minimum finding as an inner loop.

procedure Sort(A)
    for i ← 1 to |A| do
        m ← i
        for j ← i + 1 to |A| do
            if A[j] < A[m] then
                m ← j
        Exchange A[i] ↔ A[m]

Observe that when we sort N elements, after the first N − 1 minimums have been selected, the only element left is definitely the N-th biggest one, so we need NOT find the minimum when only one element remains. This indicates that the outer loop can iterate to N − 1 instead of N.

Another place we can fine tune is that we needn't swap the elements if the i-th minimum is already A[i]. The algorithm can be modified accordingly as below:

procedure Sort(A)
    for i ← 1 to |A| − 1 do
        m ← i
        for j ← i + 1 to |A| do
            if A[j] < A[m] then
                m ← j
        if m ≠ i then
            Exchange A[i] ↔ A[m]
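A Python rendering of this fine-tuned version is sketched below, mirroring the pseudocode above (0-based indexing shifts the loop bounds by one):

def ssort_tuned(xs):
    n = len(xs)
    for i in range(n - 1):        # the outer loop stops one element early
        m = i
        for j in range(i + 1, n):
            if xs[j] < xs[m]:
                m = j
        if m != i:                # skip the swap when already in place
            xs[i], xs[m] = xs[m], xs[i]
    return xs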
These modifications won't affect the performance in terms of big-O.

9.3.3 Cock-tail sort

Knuth gave an alternative realization of selection sort in [1]. Instead of selecting the minimum each time, we can select the maximum element and put it in the last position. The following algorithm illustrates this method.

procedure Sort'(A)
    for i ← |A| down-to 2 do
        m ← i
        for j ← 1 to i − 1 do
            if A[m] < A[j] then
                m ← j
        Exchange A[i] ↔ A[m]

As shown in figure 9.4, at any time the rightmost elements are sorted. The algorithm scans the unsorted ones, locates the maximum, and puts it at the tail of the unsorted range by swapping.

Figure 9.4: Select the maximum every time and put it at the end.

This version reveals that selecting the maximum can sort the elements in ascending order as well. What's more, we can find both the minimum and the maximum in one pass of traversal, putting the minimum in the first position and the maximum in the last. This speeds up the sort slightly, halving the number of outer-loop iterations.

procedure Sort(A)
    for i ← 1 to ⌊|A| / 2⌋ do
        min ← i
        max ← |A| + 1 − i
        if A[max] < A[min] then
            Exchange A[min] ↔ A[max]
        for j ← i + 1 to |A| − i do
            if A[j] < A[min] then
                min ← j
            if A[max] < A[j] then
                max ← j
        Exchange A[i] ↔ A[min]
        Exchange A[|A| + 1 − i] ↔ A[max]

This algorithm is illustrated in figure 9.5: at any time, the leftmost and rightmost parts contain the elements sorted so far, with the smaller sorted ones on the left and the bigger sorted ones on the right. The algorithm scans the unsorted range, locates both the minimum and the maximum positions, then puts them at the head and the tail of the unsorted range by swapping.

Figure 9.5: Select both the minimum and the maximum in one pass, and put them in their proper positions.

Note that it's necessary to swap the leftmost and rightmost elements before the inner loop if they are not in the correct order, because we scan the range excluding these two elements. Another method is to initialize the first element of the unsorted range as both the maximum and the minimum before the inner loop. However, since we need two swap operations after the scan, it's possible that the first swap moves the maximum or the minimum away from the position we just found, which would make the second swap malfunction. How to solve this problem is left as an exercise to the reader.

The following Python example program implements this cock-tail sort algorithm.

def cocktail_sort(xs):
    n = len(xs)
    for i in range(n // 2):
        (mi, ma) = (i, n - 1 - i)
        if xs[ma] < xs[mi]:
            (xs[mi], xs[ma]) = (xs[ma], xs[mi])
        for j in range(i + 1, n - 1 - i):
            if xs[j] < xs[mi]:
                mi = j
            if xs[ma] < xs[j]:
                ma = j
        (xs[i], xs[mi]) = (xs[mi], xs[i])
        (xs[n - 1 - i], xs[ma]) = (xs[ma], xs[n - 1 - i])
    return xs
It's possible to realize cock-tail sort in a functional approach as well. An intuitive recursive description can be given like this:

- Trivial edge case: if the list is empty or contains only one element, the sorted result is the original list;
- Otherwise, we select the minimum and the maximum, put them in the head and tail positions, then recursively sort the rest of the elements.

This description can be formalized by the following equation.

sort(L) = L : |L| ≤ 1
          {lmin} ∪ sort(L'') ∪ {lmax} : otherwise        (9.7)

Where the minimum and the maximum are extracted from L by a function select(L):

(lmin, L'', lmax) = select(L)

Note that the minimum is linked to the front of the recursively sorted result; semantically this is a constant O(1) `cons' (refer to the appendix of this book for details). The maximum, however, is appended to the tail, which is typically an expensive linear O(N) operation. We'll optimize it later.

Function select(L) scans the whole list to find both the minimum and the maximum. It can be defined as below:

select(L) = (min(l1, l2), ∅, max(l1, l2)) : L = {l1, l2}
            (l1, {lmin} ∪ L'', lmax) : l1 < lmin
            (lmin, {lmax} ∪ L'', l1) : lmax < l1
            (lmin, {l1} ∪ L'', lmax) : otherwise        (9.8)

Where (lmin, L'', lmax) = select(L'), and L' is the rest of the list except for the first element l1. If there are only two elements in the list, we pick the smaller as the minimum and the bigger as the maximum; after extracting them, the list becomes empty. This is the trivial edge case. Otherwise, we take the first element l1 out, recursively perform selection on the rest of the list, then check whether l1 is less than the minimum candidate or greater than the maximum candidate, so that we can finalize the result.

Note that in none of the cases is an appending operation needed to form the result. However, since selection must scan all the elements to determine the minimum and the maximum, it is bound to O(N) linear time.

The complete example Haskell program is given as the following.

csort [] = []
csort [x] = [x]
csort xs = mi : csort xs' ++ [ma] where
    (mi, xs', ma) = extractMinMax xs

extractMinMax [x, y] = (min x y, [], max x y)
extractMinMax (x:xs) | x < mi = (x, mi:xs', ma)
                     | ma < x = (mi, ma:xs', x)
                     | otherwise = (mi, x:xs', ma)
    where (mi, xs', ma) = extractMinMax xs
We mentioned that the appending operation is expensive in this intuitive version. It can be improved in two steps. The first step is to convert the cock-tail sort into a tail-recursive call. Denote the sorted small ones as A and the sorted big ones as B in figure 9.5; we use A and B as accumulators. The new cock-tail sort is defined as the following.

sort'(A, L, B) = A ∪ L ∪ B : L = ∅ ∨ |L| = 1
                 sort'(A ∪ {lmin}, L'', {lmax} ∪ B) : otherwise        (9.9)

Where lmin, lmax, and L'' are defined as before. We start sorting by passing empty A and B: sort(L) = sort'(∅, L, ∅).

Besides the edge case, observe that the appending operation only happens on A ∪ {lmin}, while lmax is only linked to the head of B. This appending occurs in every recursive call. To eliminate it, we can store A in reverse order, so that lmin can be `cons'ed to the head instead of appended. Denote cons(x, L) = {x} ∪ L and append(L, x) = L ∪ {x}; we have the equation below.

append(L, x) = reverse(cons(x, reverse(L)))        (9.10)

So if we keep the accumulator as the reversed list of A, appending lmin to A becomes a plain cons, and a single final reverse restores the order. Based on this idea, the algorithm can be improved one more step as the following (here A is kept in reverse order).

sort'(A, L, B) = reverse(A) ∪ B : L = ∅
                 reverse({l1} ∪ A) ∪ B : |L| = 1
                 sort'({lmin} ∪ A, L'', {lmax} ∪ B) : otherwise        (9.11)

This algorithm can be implemented in Haskell as below.

csort' xs = cocktail [] xs [] where
    cocktail as [] bs = reverse as ++ bs
    cocktail as [x] bs = reverse (x:as) ++ bs
    cocktail as xs bs = let (mi, xs', ma) = extractMinMax xs
                        in cocktail (mi:as) xs' (ma:bs)

Exercise 9.2

- Realize the imperative basic selection sort algorithm so that it can take a comparator as a parameter. Please try both a dynamically typed language and a statically typed language. How do you annotate the type of the comparator as generally as possible in a statically typed language?
- Implement Knuth's version of selection sort in your favorite programming language.
- An alternative way to realize cock-tail sort is to treat the i-th element as both the minimum and the maximum; after the inner loop, the minimum and the maximum are found, and we can swap the minimum to the i-th position and the maximum to position |A| + 1 − i. Implement this solution in your favorite imperative language. Please note several special edge cases that should be handled correctly:
  - A = {max, min, ...};
  - A = {..., max, min};
  - A = {max, ..., min}.

Please don't refer to the example source code along with this chapter before you try to solve this problem.

9.4 Major improvement

Although cock-tail sort halves the number of loops, the performance is still bound to quadratic time. This means that the methods we have developed so far handle big data poorly compared to the divide-and-conquer sorting solutions.

To improve selection-based sort essentially, we must analyze where the bottleneck is. In order to sort the elements by comparison, we must examine all the elements for ordering; thus the outer loop of selection sort is necessary. However, must it scan all the elements every time to select the minimum? Note that when we pick the smallest one the first time, we actually traverse the whole collection, so we partially learn which elements are relatively big and which are relatively small. The problem is that when we select the further minimums, instead of reusing the ordering information we obtained previously, we drop it all and blindly start a new traversal.

So the key to improving selection-based sort is to reuse the previous results. There are several approaches; in this chapter, we'll adopt an intuitive idea inspired by the football match.

9.4.1 Tournament knock out

The football world cup is held every four years, and 32 teams from different continents play the final games. Before 1982, there were 16 teams competing in the tournament final. For simplicity, let's go back to 1978 and imagine a way to determine the champion:
In the first round, the teams are grouped into 8 pairs to play; after this round there are 8 winners, and 8 teams are knocked out. In the second round, these 8 teams are grouped into 4 pairs, and there are 4 winners after their games. The top 4 teams are then divided into 2 pairs, so that only two teams are left for the final game. The champion is determined after 4 rounds of games in total, and there are actually 8 + 4 + 2 + 1 = 15 games.

Now we have the world cup champion; however, the world cup won't finish at this stage, because we still need to determine the silver-medal team. Readers may argue: isn't the team beaten by the champion in the final game the second best? This is true according to the real world cup rules, but it isn't fair enough in some sense. We often hear about the so-called `group of death'. Suppose the Brazilian team is grouped with the German team at the very beginning; although both teams are quite strong, one of them must be knocked out. It's quite possible that the team losing that game could beat all the other teams except the champion. Figure 9.6 illustrates such a case.

Figure 9.6: The element 15 is knocked out in the first round.
Imagine that every team has a number: the bigger the number, the stronger the team. Suppose the stronger team always beats the weaker one; although this is not true in the real world, the simplification is fair enough for developing the tournament knock-out solution. The maximum number, which represents the champion, is 16. Definitely, the team with number 14 isn't the second best according to our rules; it should be 15, which was knocked out in the first round of comparison.

The key question is to find an efficient way to locate the second maximum number in this tournament tree. After that, we apply the same method to select the third, the fourth, ..., to accomplish the selection-based sort.

One idea is to assign the champion a very small number (for instance, −∞), so that it won't be selected next time, and the second best becomes the new champion. However, suppose there are 2^m teams for some natural number m; it still takes 2^(m-1) + 2^(m-2) + ... + 2 + 1 = 2^m − 1 comparisons to determine the new champion, which is as slow as the first round.
Actually, we needn't perform a bottom-up comparison at all, since the tournament tree stores plenty of ordering information. Observe that the second best team must have been beaten by the champion at some point, or it would be the final winner. So we can track the path from the root of the tournament tree down to the champion's leaf, and examine only the teams along this path to find the second best team. In figure 9.6, this path is marked in gray; the elements to be examined are {14, 13, 7, 15}.

Based on this idea, we refine the algorithm like below.

1. Build a tournament tree from the elements to be sorted, so that the champion (the maximum) becomes the root;
2. Extract the root from the tree, performing a top-down pass that replaces the maximum with −∞;
3. Perform a bottom-up back-track along the path, determine the new champion, and make it the new root;
4. Repeat step 2 until all elements have been extracted.

Figures 9.7, 9.8, and 9.9 show the steps of applying this strategy.
Figure 9.7: Extract 16, replace it with −∞; 15 sifts up to the root.
Figure 9.8: Extract 15, replace it with −∞; 14 sifts up to the root.
Figure 9.9: Extract 14, replace it with −∞; 13 sifts up to the root.
We can reuse the binary tree definition from the first chapter of this book to represent the tournament tree. In order to back-track from a leaf to the root, every node should hold a reference to its parent (the concept of a pointer in environments such as ANSI C):

struct Node {
    Key key;
    struct Node *left, *right, *parent;
};

To build a tournament tree from a list of elements (suppose the number of elements is 2^m for some m), we first wrap each element as a leaf, so that we obtain a list of binary trees. We take every two trees from this list, compare their keys, and form a new binary tree with the bigger key as the root; the two trees become the left and right children of this new binary tree. Repeating this operation builds a new list of trees, with the height of each tree increased by one. Note that the size of the tree list halves after each such pass, so we can keep reducing the list until only one tree is left: this is the finally built tournament tree.

function Build-Tree(A)
    T ← ∅
    for each x ∈ A do
        t ← Create-Node
        Key(t) ← x
        Append(T, t)
    while |T| > 1 do
        T' ← ∅
        for every pair t1, t2 ∈ T do
            t ← Create-Node
            Key(t) ← Max(Key(t1), Key(t2))
            Left(t) ← t1
            Right(t) ← t2
            Parent(t1) ← t
            Parent(t2) ← t
            Append(T', t)
        T ← T'
    return T[1]

Suppose the length of the list A is N. This algorithm first traverses the list to build the leaves, which takes linear O(N) time; then it repeatedly compares pairs, looping in proportion to N + N/2 + N/4 + ... + 2 ≈ 2N. So the total performance is bound to O(N) time. The following ANSI C program implements this tournament tree building algorithm.

struct Node* build(const Key* xs, int n) {
    int i;
    struct Node *t, **ts = (struct Node**) malloc(sizeof(struct Node*) * n);
    for (i = 0; i < n; ++i)
        ts[i] = leaf(xs[i]);
    for (; n > 1; n /= 2)
        for (i = 0; i < n; i += 2)
            ts[i/2] = branch(max(ts[i]->key, ts[i+1]->key), ts[i], ts[i+1]);
    t = ts[0];
    free(ts);
    return t;
}

The type of the key can be defined elsewhere, for example:

typedef int Key;

Function leaf(x) creates a leaf node with value x as the key and sets all its fields, left, right, and parent, to NIL. Function branch(key, left, right) creates a branch node and links the newly created node as the parent of its two children if they are not empty. For the sake of brevity, we skip their details; they are left as an exercise to the reader, and the complete program can be downloaded along with this book.

Some programming environments, such as Python, provide a tool to iterate over every two elements at a time, for example:

for x, y in zip(*[iter(ts)] * 2):
language-specific features; readers can refer to the Python example program along with this book for details. When the maximum element is extracted from the tournament tree, we replace it with −∞, and repeat this replacement along the path from the root to the leaf. Next, we back-track to the root through the parent
field and determine the new maximum element.

function Extract-Max(T)
  m ← Key(T)
  Key(T) ← −∞
  while ¬ Leaf?(T) do                        ▷ The top-down pass
    if Key(Left(T)) = m then
      T ← Left(T)
    else
      T ← Right(T)
    Key(T) ← −∞
  while Parent(T) ≠ NIL do                   ▷ The bottom-up pass
    T ← Parent(T)
    Key(T) ← Max(Key(Left(T)), Key(Right(T)))
  return m

This algorithm returns the extracted maximum element, and
modifies the tournament tree in place. Because we can't represent −∞ in a real program with a word of limited length, one approach is to
define a relatively big negative number which is less than all the elements in the tournament tree. For example, suppose all the elements are greater than -65535; we can then define negative
infinity as below:

#define N_INF -65535

We can implement this algorithm as the following ANSI C example program.

Key pop(struct Node* t) {
    Key x = t->key;
    t->key = N_INF;
    while (!isleaf(t)) {
        t = t->left->key == x ? t->left : t->right;
        t->key = N_INF;
    }
    while (t->parent) {
        t = t->parent;
        t->key = max(t->left->key, t->right->key);
    }
    return x;
}

The behavior of Extract-Max is quite similar to the pop operation of some other data structures, such as the queue and the heap; thus we name it pop in this code snippet. Algorithm Extract-Max processes the tree in two passes, one top-down, then one bottom-up, along the path by which the 'champion team wins the world cup'. Because the tournament tree is well balanced, the length of this path, which is the height of the tree, is bound to O(lg N), where N is the number of elements to be sorted (equal to the number of leaves). Thus the performance of this algorithm is O(lg N). It's now possible to realize the tournament knock-out sort. We build a tournament tree from the elements to be sorted, then continuously extract the maximum. If we want to sort in monotonically increasing order, we put the
first extracted element at the rightmost position, then insert the subsequently extracted elements one by one towards the left; otherwise, if we want to sort in decreasing order, we can just append the extracted elements to the result. Below is the algorithm that sorts elements in ascending order.

procedure Sort(A)
  T ← Build-Tree(A)
  for i ← |A| down to 1 do
    A[i] ← Extract-Max(T)

Translating it to an ANSI C example program is straightforward.

void tsort(Key* xs, int n) {
    struct Node* t = build(xs, n);
    while (n)
        xs[--n] = pop(t);
    release(t);
}

This algorithm
first takes O(N) time to build the tournament tree, then performs N pops to select the maximum of the elements left in the tree. Since each pop operation is bound to O(lg N), the total performance of tournament knock-out sorting is O(N lg N).

Re
fine the tournament knock-out

It's possible to design the tournament knock-out algorithm in a purely functional approach. And we'll see that the two passes (the
first, top-down, replacing the champion with −∞, then the bottom-up one determining the new champion) in the pop operation can be combined in a recursive manner, so that we no longer need the parent
field any more. We can re-use the functional binary tree
definition, as in the following example Haskell code.
data Tr a = Empty | Br (Tr a) a (Tr a)

Thus a binary tree is either empty or a branch node containing a key, a left sub-tree, and a right sub-tree, where both children are again binary trees. We've used a hard-coded big negative number to represent −∞. However, this solution is ad hoc, and it forces all the elements to be sorted to be greater than this pre-
defined magic number. Some programming environments support algebraic types, so that we can define negative
infinity explicitly. For instance, the below Haskell program sets up the concept of
infinity.¹

data Infinite a = NegInf | Only a | Inf deriving (Eq, Ord)

From now on, we switch back to using the min() function to determine the winner, so that the tournament selects the minimum instead of the maximum as the champion. Denote by key(T) the function that returns the key of the tree rooted at T. Function wrap(x) wraps the element x into a leaf node. Function tree(l, k, r) creates a branch node, with k as the key and l and r as the two children respectively. The knock-out process can be represented as comparing two trees, picking the smaller key as the new key, and setting these two trees as the children:

branch(T1, T2) = tree(T1, min(key(T1), key(T2)), T2)    (9.12)

This can be implemented in Haskell word by word:

branch t1 t2 = Br t1 (min (key t1) (key t2)) t2

There is a limitation in our tournament sorting algorithm so far. It only accepts a collection of elements of size 2^m; otherwise we can't build a complete binary tree. This can actually be solved in the tree building process. Remember that we pick two trees at a time, compare them, and pick the winner. This is perfect if there is always an even number of trees. Consider a case in a football match where one team is absent for some reason (severe flight delay or whatever), so that one team is left without a challenger. One option is to make this team the winner, so that it will attend the further games. We can use a similar approach here. To build the tournament tree from a list of elements, we wrap every element into a leaf, then start the building process.

build(L) = build'({wrap(x) | x ∈ L})    (9.13)

The build'(T) function terminates when there is only one tree left in T, which is the champion. This is the trivial edge case. Otherwise, it groups every two trees in a pair to determine the winners. When there is an odd number of trees, it just makes the last tree the winner, attending the next level of the tournament, and recursively repeats the building process.

build'(T) = T1               : |T| ≤ 1
          = build'(pair(T))  : otherwise    (9.14)

¹The order of the definitions of 'NegInf', the regular number, and 'Inf' is significant if we want to derive the default, correct comparing behavior of 'Ord'. Anyway, it's possible to specify the detailed order by making it an instance of 'Ord'. However, this is a language-specific feature which is out of the scope of this book. Please refer to other textbooks about Haskell.
Note that this algorithm actually handles another special case: when the list to be sorted is empty, the result is obviously empty. Denote T = {T1, T2, ...} if there are at least two trees, and let T' represent the remaining trees after removing the
first two. Function pair(T) is
defined as the following.

pair(T) = {branch(T1, T2)} ∪ pair(T')  : |T| ≥ 2
        = T                            : otherwise    (9.15)

The complete tournament tree building algorithm can be implemented as the below example Haskell program.

fromList :: (Ord a) => [a] -> Tr (Infinite a)
fromList = build . (map wrap) where
    build [] = Empty
    build [t] = t
    build ts = build $ pair ts
    pair (t1:t2:ts) = (branch t1 t2) : pair ts
    pair ts = ts
When extracting the champion (the minimum) from the tournament tree, we examine which of the left and right child sub-trees has the same key as the root, and recursively extract from that tree until we arrive at a leaf node. Denote the left sub-tree of T as L, the right sub-tree as R, and K as its key. We can define this popping algorithm as the following.

pop(T) = tree(∅, ∞, ∅)                      : L = ∅ ∧ R = ∅
       = tree(L', min(key(L'), key(R)), R)  : K = key(L), L' = pop(L)
       = tree(L, min(key(L), key(R')), R')  : K = key(R), R' = pop(R)
                                              (9.16)

It's straightforward to translate this algorithm into example Haskell code.

pop (Br Empty _ Empty) = Br Empty Inf Empty
pop (Br l k r) | k == key l = let l' = pop l in Br l' (min (key l') (key r)) r
               | k == key r = let r' = pop r in Br l (min (key l) (key r')) r'

Note that this algorithm only removes the current champion without returning it, so it's necessary to
define a function to get the champion at the root node.

top(T) = key(T)    (9.17)

With these functions
defined, tournament knock-out sorting can be formalized by using them.

sort(L) = sort'(build(L))    (9.18)

where sort'(T) continuously pops the minimum element to form the result list:

sort'(T) = ∅                        : T = ∅ ∨ key(T) = ∞
         = {top(T)} ∪ sort'(pop(T)) : otherwise    (9.19)

The rest of the Haskell code is given below to complete the implementation.
top = only . key

tsort :: (Ord a) => [a] -> [a]
tsort = sort' . fromList where
    sort' Empty = []
    sort' (Br _ Inf _) = []
    sort' t = (top t) : (sort' $ pop t)

And the auxiliary functions only, key, and wrap, together with the explicit infinity support, are listed as the following.

only (Only x) = x
key (Br _ k _) = k
wrap x = Br Empty (Only x) Empty
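As a quick sanity check (a hypothetical test, not from the original text; it assumes the definitions above are in scope), the functional tournament sort yields ascending order and handles the empty list:

-- A minimal check of tsort; the expected outputs are in the comments.
main :: IO ()
main = do
    print $ tsort [7, 6, 10, 8, 13, 16, 15, 9]   -- [6,7,8,9,10,13,15,16]
    print $ tsort ([] :: [Int])                  -- []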
Exercise 9.3

- Implement the helper functions leaf(), branch(), max(), isleaf(), and release() to complete the imperative tournament tree program.
- Implement the imperative tournament tree in a programming language that supports GC (garbage collection).
- Why can our tournament tree knock-out sort algorithm handle duplicated elements (elements with the same value)? We say a sorting algorithm is stable if it keeps the original order of elements with the same value. Is tournament tree knock-out sorting stable?
- Design an imperative tournament tree knock-out sort algorithm which
satisfies the following:
  - it can handle an arbitrary number of elements;
  - it doesn't use a hard-coded negative
infinity, so that it can take elements with any value.
- Compare the tournament tree knock-out sort algorithm and the binary tree sort algorithm; analyze their efficiency both in time and space.
- Compare the heap sort algorithm and the binary tree sort algorithm, and do the same analysis for them.

9.4.2 Final improvement by using heap sort

We managed to improve the performance of selection-based sorting to O(N lg N) by using tournament knock-out. This is the limit of comparison-based sorting according to [1]. However, there is still room for improvement. After sorting, we are left with a complete binary tree whose leaves and branches all hold useless
infinite values. This isn't space efficient at all. Can we release the nodes while popping? Another observation is that if there are N elements to be sorted, we actually allocate about 2N tree nodes: N for the leaves and N for the branches. Is there any better way to halve the space usage?
The final sorting structure described in equation 9.19 can easily be unified into a more general one if we treat the tree as empty when its root holds
infinity as its key:

sort'(T) = ∅                        : T = ∅
         = {top(T)} ∪ sort'(pop(T)) : otherwise    (9.20)

This is exactly the same as the heap sort we gave in the previous chapter. A heap always keeps the minimum (or the maximum) on top and provides a fast pop operation. The binary heap by implicit array encodes the tree structure in the array indices, so no extra space is allocated except for the N array cells. The functional heaps, such as the leftist heap and the splay heap, allocate N nodes as well. We'll introduce more heaps in the next chapter, which perform well in many aspects.

9.5 Short summary

In this chapter, we presented the evolution of selection-based sorting. Selection sort is easy, and is commonly used as an example to teach students about nested loops. It has a simple and straightforward structure, but the performance is quadratic. In this chapter, we saw that there exist ways to improve it, not only by some
fine tuning, but also by fundamentally changing the data structure, which leads to tournament knock-out and heap sort.
Bibliography

[1] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional, May 4, 1998. ISBN-10: 0201896850, ISBN-13: 978-0201896855

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937

[3] Wikipedia. Strict weak order. http://en.wikipedia.org/wiki/Strict_weak_order

[4] Wikipedia. FIFA World Cup. http://en.wikipedia.org/wiki/FIFA_World_Cup
Chapter 10: Binomial heap, Fibonacci heap, and pairing heap

10.1 Introduction

In the previous chapter, we mentioned that heaps can be generalized and implemented with various data structures. However, only binary heaps have been covered so far, whether by explicit binary trees or by an implicit array. It's quite natural to extend the binary tree to a K-ary [1] tree. In this chapter, we
first show binomial heaps, which actually consist of a forest of K-ary trees. Binomial heaps bound the performance of all operations to O(lg N), as well as keeping the
finding of the minimum element at O(1) time. If we delay some operations in binomial heaps by using a lazy strategy, they turn into Fibonacci heaps. All the binary heaps we have shown perform no better than O(lg N) time for merging; we'll show that it's possible to improve this to O(1) with the Fibonacci heap, which is quite helpful for graph algorithms. Actually, the Fibonacci heap achieves a good amortized time bound of O(1) for almost all operations, leaving only the heap pop at O(lg N). Finally, we'll introduce the pairing heap. It has the best performance in practice, although the proof of this is still a conjecture for the time being.

10.2 Binomial Heaps

10.2.1 De
finition

The binomial heap is more complex than most of the binary heaps. However, it has excellent merge performance, bound to O(lg N) time. A binomial heap consists of a list of binomial trees.

Binomial tree

In order to explain why the tree is named 'binomial', let's review the famous Pascal's triangle (also known as Jia Xian's triangle, commemorating the Chinese mathematician Jia Xian (1010-1070)) [4].
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
...

In each row, the numbers are all binomial coefficients. There are many ways to generate a series of binomial coefficients; one of them is by recursive composition. Binomial trees, as well, can be
defined recursively as the following.

- A binomial tree of rank 0 has only one node as the root;
- A binomial tree of rank N consists of two binomial trees of rank N-1; of these two sub-trees, the one with the bigger root element is linked as the leftmost child of the other.

We denote a binomial tree of rank 0 as B0, and a binomial tree of rank n as Bn. Figure 10.1 shows a B0 tree and how two B(n-1) trees are linked into a Bn tree.

[Figure 10.1: Recursive definition of binomial trees. (a) A B0 tree. (b) Linking two B(n-1) trees yields a Bn tree.]

With this recursive definition, it is easy to draw the forms of the binomial trees of rank 0, 1, 2, ..., as shown in
Figure 10.2. Observing the binomial trees reveals some interesting properties. For each rank-N binomial tree, if we count the number of nodes in each row, we find binomial coefficients. For instance, for the rank-4 binomial tree, there is 1 node as the root; in the second level next to the root, there are 4 nodes; in the 3rd level, there are 6 nodes;
[Figure 10.2: Forms of binomial trees with rank = 0, 1, 2, 3, 4, ...]
in the 4th level, there are 4 nodes; and in the 5th level, there is 1 node. They are exactly 1, 4, 6, 4, 1, which is the 5th row in Pascal's triangle. That's why we call it a binomial tree. Another interesting property is that the total number of nodes in a binomial tree of rank N is 2^N. This can be proved either by the binomial theorem or directly from the recursive definition.
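To make the second property concrete, here is a small sketch (hypothetical, not from the book; the Shape type is introduced only for this illustration) that builds the shape of a B_n tree by the recursive rule and counts its nodes:

-- The shape of a binomial tree, ignoring keys: a node and its children.
data Shape = S [Shape]

-- b n builds B_n: two B_(n-1) trees, one linked as the leftmost child
-- of the other, exactly as in the recursive definition above.
b :: Int -> Shape
b 0 = S []
b n = let S cs = b (n - 1) in S (b (n - 1) : cs)

size :: Shape -> Int
size (S cs) = 1 + sum (map size cs)

-- size (b n) == 2^n, e.g. size (b 4) == 16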
Binomial heap

With the binomial tree defined, we can introduce the definition of the binomial heap. A binomial heap is a set of binomial trees (or a forest of binomial trees) that satisfies the following properties:

- Each binomial tree in the heap conforms to the heap property: the key of a node is equal to or greater than the key of its parent. Here the heap is actually a min-heap; for a max-heap, this changes to 'equal to or less than'. In this chapter, we only discuss the min-heap; the max-heap can be obtained equally by changing the comparison condition.
- There is at most one binomial tree of rank r. In other words, no two binomial trees have the same rank.

This
definition leads to an important result: for a binomial heap containing N elements, if we convert N to binary format, yielding (a0, a1, a2, ..., am), where a0 is the LSB and am is the MSB, then for each 0 ≤ i ≤ m, if ai = 0 there is no binomial tree of rank i, and if ai = 1 there must be a binomial tree of rank i. For example, if a binomial heap contains 5 elements, since 5 is '(LSB) 101 (MSB)', there are 2 binomial trees in this heap: one tree of rank 0, the other of rank 2. Figure 10.3 shows a binomial heap with 19 nodes; since 19 is '(LSB) 11001 (MSB)' in binary format, there is a B0 tree, a B1 tree, and a B4 tree.

[Figure 10.3: A binomial heap with 19 elements]
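This correspondence is easy to compute; the following sketch (hypothetical, for illustration only) lists the ranks present in an N-element binomial heap as the positions of the 1 bits of N:

-- Ranks of the binomial trees in a heap of n elements: the positions
-- of the 1 bits in the binary representation of n.
ranks :: Int -> [Int]
ranks n = [i | (i, d) <- zip [0..] (bits n), d == 1]
  where bits 0 = []
        bits m = (m `mod` 2) : bits (m `div` 2)

-- ranks 5 == [0,2];  ranks 19 == [0,1,4]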
Data layout

There are two ways to define K-ary trees imperatively. One is the 'left-child, right-sibling' approach [2]. It is compatible with the typical binary tree structure. Each node has two fields: we use the left field to point to the first child of this node, and the right field to point to the sibling of this node. All siblings are represented as a singly linked list. Figure 10.4 shows an example tree represented in this way.

[Figure 10.4: Example tree represented in the 'left-child, right-sibling' way. R is the root node; it has no sibling, so its right side points to NIL. C1, C2, ..., Cn are children of R. C1 is linked from the left side of R; the other siblings of C1 are linked one next to another on the right side of C1. C1', C2', ..., Cm' are children of C1.]

The other way is to use a library-
defined collection container, such as an array or a list, to represent all children of a node. Since the rank of a tree plays a very important role, we also define it as a field. For the 'left-child, right-sibling' method, we define the binomial tree as the following.¹

class BinomialTree:
    def __init__(self, x = None):
        self.rank = 0
        self.key = x
        self.parent = None
        self.child = None
        self.sibling = None

When initializing a tree with a key, we create a leaf node, set its rank to zero, and set all the other fields to NIL. It is quite natural to utilize a pre-defined list to represent multiple children, as below.

class BinomialTree:
    def __init__(self, x = None):
        self.rank = 0
        self.key = x
        self.parent = None
        self.children = []

¹C programs are also provided along with this book.
For purely functional settings, such as in the Haskell language, the binomial tree is defined as the following.

data BiTree a = Node { rank :: Int
                     , root :: a
                     , children :: [BiTree a]}

The binomial heap is then defined as a list of binomial trees (a forest) with ranks in monotonically increasing order. As another implicit constraint, no two binomial trees have the same rank.

type BiHeap a = [BiTree a]

10.2.2 Basic heap operations

Linking trees

Before diving into the basic heap operations such as pop and insert, we'll
first realize how to link two binomial trees of the same rank into a bigger one. According to the
definition of the binomial tree and the heap property that the root always contains the minimum key, we
first compare the two root values, select the smaller one as the new root, and insert the other tree as the
first child in front of all the other children. Suppose the functions Key(T), Children(T), and Rank(T) access the key, children, and rank of a binomial tree respectively.

link(T1, T2) = node(r + 1, x, {T2} ∪ C1) : x < y
             = node(r + 1, y, {T1} ∪ C2) : otherwise    (10.1)

where x = Key(T1), y = Key(T2), r = Rank(T1) = Rank(T2), C1 = Children(T1), and C2 = Children(T2).

[Figure 10.5: Suppose x < y; insert y as the
first child of x.]

Note that the link operation is bound to O(1) time if ∪ is a constant-time operation. It's easy to translate the link function to a Haskell program as the following.
link :: (Ord a) => BiTree a -> BiTree a -> BiTree a
link t1@(Node r x c1) t2@(Node _ y c2) =
    if x < y then Node (r+1) x (t2:c1)
    else Node (r+1) y (t1:c2)

It's possible to realize the link operation in an imperative way. If we use the 'left-child, right-sibling' approach, we just link the tree which has the bigger key to the left side of the other, and link its original children to the right side as siblings. Figure 10.6 shows the result of one case.

1: function Link(T1, T2)
2:   if Key(T2) < Key(T1) then
3:     Exchange T1 ↔ T2
4:   Sibling(T2) ← Child(T1)
5:   Child(T1) ← T2
6:   Parent(T2) ← T1
7:   Rank(T1) ← Rank(T1) + 1
8:   return T1

[Figure 10.6: Suppose x < y; link y to the left side of x and link the original children of x to the right side of y.]

And if we use a container to manage all children of a node, the algorithm is like below.

1: function Link'(T1, T2)
2:   if Key(T2) < Key(T1) then
3:     Exchange T1 ↔ T2
4:   Parent(T2) ← T1
5:   Insert-Before(Children(T1), T2)
6:   Rank(T1) ← Rank(T1) + 1
7:   return T1

It's easy to translate both algorithms to real programs. Here we only show the Python program of Link' for illustration purposes.²

def link(t1, t2):
    if t2.key < t1.key:
        (t1, t2) = (t2, t1)
    t2.parent = t1
    t1.children.insert(0, t2)
    t1.rank = t1.rank + 1
    return t1

²The C and C++ programs are also available along with this book.
Exercise 10.1

Implement the tree-linking program in your favorite language with the left-child, right-sibling method.

We mentioned that linking is a constant-time algorithm, and this is true when using the left-child, right-sibling approach. However, if we use a container to manage the children, the performance depends on the concrete implementation of the container: if it is a plain array, the linking time will be proportional to the number of children. In this chapter, we assume the time is constant; this is true if the container is implemented as a linked list.

Insert a new element to the heap (push)

As the ranks of the binomial trees in a forest are monotonically increasing, by using the link function
defined above, it's possible to define an auxiliary function, so that we can insert a new tree, with rank no bigger than the smallest one, into the heap, which is actually a forest. Denote the non-empty heap as H = {T1, T2, ..., Tn}; we define

insertT(H, T) = {T}                       : H = ∅
              = {T} ∪ H                   : Rank(T) < Rank(T1)
              = insertT(H', link(T, T1))  : otherwise    (10.2)

where H' = {T2, T3, ..., Tn}.

The idea is that for the empty heap, we set the new tree as the only element to create a singleton forest; otherwise, we compare the ranks of the new tree and the
first tree in the forest. If they are the same, we link them together and recursively insert the linked result (a tree with rank increased by one) into the rest of the forest; if they are not the same, since the pre-condition constrains the rank of the new tree, it must be the smallest, so we put this new tree in front of all the other trees in the forest. From the binomial properties mentioned above, there are at most O(lg N) binomial trees in the forest, where N is the total number of nodes. Thus function insertT performs at most O(lg N) linkings, which are all constant-time operations. So the performance of insertT is O(lg N).³

The relative Haskell program is given below.

insertTree :: (Ord a) => BiHeap a -> BiTree a -> BiHeap a
insertTree [] t = [t]
insertTree ts@(t':ts') t = if rank t < rank t' then t:ts
                           else insertTree ts' (link t t')

³There is an interesting observation obtained by comparing this operation with the addition of two binary numbers, which leads to the topic of numeric representation [6].
With this auxiliary function, it's easy to realize the insertion. We can wrap the new element to be inserted as the only leaf of a tree, then insert this tree into the binomial heap.

insert(H, x) = insertT(H, node(0, x, ∅))    (10.3)

And we can continuously build a heap from a series of elements by folding. For example, the following Haskell code
defines a helper function fromList.

fromList = foldl insert []

Since wrapping an element as a singleton tree takes O(1) time, the real work is done in insertT; the performance of binomial heap insertion is bound to O(lg N).
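The insert function itself is not spelled out above; a minimal sketch consistent with equation (10.3) might look like this (hypothetical, assuming the BiTree constructor from the data definition):

insert :: (Ord a) => BiHeap a -> a -> BiHeap a
insert h x = insertTree h (Node 0 x [])

-- The binary analogy from the footnote is visible here:
-- map rank (fromList [1..5]) == [0,2], matching 5 = (101) in binary.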
The insertion algorithm can also be realized with an imperative approach.

Algorithm 4: Insert a tree with the 'left-child, right-sibling' method.
1: function Insert-Tree(H, T)
2:   while H ≠ ∅ ∧ Rank(Head(H)) = Rank(T) do
3:     (T1, H) ← Extract-Head(H)
4:     T ← Link(T, T1)
5:   Sibling(T) ← H
6:   return T

Algorithm 4 continuously links the first tree in the heap with the new tree to be inserted while they have the same rank. After that, it sets the linked list of the remaining trees as the sibling, and returns the new linked list. If we use a container to manage the children of a node, the algorithm can be given as Algorithm 5.

Algorithm 5: Insert a tree with children managed by a container.
1: function Insert-Tree'(H, T)
2:   while H ≠ ∅ ∧ Rank(H[0]) = Rank(T) do
3:     T1 ← Pop(H)
4:     T ← Link(T, T1)
5:   Head-Insert(H, T)
6:   return H

In this algorithm, function Pop removes the
first tree T1 = H[0] from the forest, and function Head-Insert inserts a new tree before any other trees in the heap, so that it becomes the
first element in the forest. With either Insert-Tree or Insert-Tree'
defined, realizing the binomial heap insertion is trivial.

Algorithm 6: Imperative insert algorithm
1: function Insert(H, x)
2:   return Insert-Tree(H, Node(0, x, ∅))

The following Python program implements the insert algorithm by using a container to manage the sub-trees; the 'left-child, right-sibling' program is left as an exercise.
def insert_tree(ts, t):
    while ts != [] and t.rank == ts[0].rank:
        t = link(t, ts.pop(0))
    ts.insert(0, t)
    return ts

def insert(h, x):
    return insert_tree(h, BinomialTree(x))

Exercise 10.2

Write the insertion program in your favorite imperative programming language by using the 'left-child, right-sibling' approach.

Merge two heaps

When merging two binomial heaps, we actually try to merge two forests of binomial trees. According to the
definition, there can't be two trees with the same rank, and the ranks are in monotonically increasing order. Our strategy is very similar to merge sort: in every iteration, we take the
first tree from each forest, compare their ranks, and pick the one with the smaller rank for the result heap; if the ranks are equal, we perform linking to get a new tree and recursively insert this new tree into the result of merging the rest of the trees. Figure 10.7 illustrates the idea of this algorithm. This method is different from the one given in [2]. We can formalize this idea with a function. For the non-empty cases, we denote the two heaps as H1 = {T1, T2, ...} and H2 = {T1', T2', ...}. Let H1' = {T2, T3, ...} and H2' = {T2', T3', ...}.

merge(H1, H2) = H1                                       : H2 = ∅
              = H2                                       : H1 = ∅
              = {T1} ∪ merge(H1', H2)                    : Rank(T1) < Rank(T1')
              = {T1'} ∪ merge(H1, H2')                   : Rank(T1) > Rank(T1')
              = insertT(merge(H1', H2'), link(T1, T1'))  : otherwise    (10.4)

To analyze the performance of merge, suppose there are m1 trees in H1 and m2 trees in H2. There are at most m1 + m2 trees in the merged result. If no two trees have the same rank, the merge operation is bound to O(m1 + m2) time; if linking is needed for trees of the same rank, insertT performs at most O(m1 + m2) time. Consider the fact that m1 = 1 + ⌊lg N1⌋ and m2 = 1 + ⌊lg N2⌋, where N1, N2 are the numbers of nodes in each heap, and ⌊lg N1⌋ + ⌊lg N2⌋ ≤ 2⌊lg N⌋, where N = N1 + N2 is the total number of nodes. The
final performance of merging is O(lg N). Translating this algorithm to Haskell yields the following program.

merge :: (Ord a) => BiHeap a -> BiHeap a -> BiHeap a
merge ts1 [] = ts1
merge [] ts2 = ts2
merge ts1@(t1:ts1') ts2@(t2:ts2')
    | rank t1 < rank t2 = t1:(merge ts1' ts2)
    | rank t1 > rank t2 = t2:(merge ts1 ts2')
    | otherwise = insertTree (merge ts1' ts2') (link t1 t2)
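A quick hypothetical check of the rank invariant after merging (assuming the insert sketch given earlier, so that fromList works):

-- 3 + 5 = 8 = (1000) in binary, so merging a 3-element heap with a
-- 5-element heap should leave a single tree of rank 3.
test :: Bool
test = map rank (merge (fromList [1..3]) (fromList [4..8])) == [3]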
[Figure 10.7: Merge two heaps. (a) Pick the tree with the smaller rank for the result. (b) If two trees have the same rank, link them into a new tree, and recursively insert it into the merge result of the rest.]
The merge algorithm can also be described in an imperative way, as shown in Algorithm 7.

Algorithm 7: Imperative merge of two binomial heaps
1: function Merge(H1, H2)
2:   if H1 = ∅ then
3:     return H2
4:   if H2 = ∅ then
5:     return H1
6:   H ← ∅
7:   while H1 ≠ ∅ ∧ H2 ≠ ∅ do
8:     T ← ∅
9:     if Rank(H1) < Rank(H2) then
10:      (T, H1) ← Extract-Head(H1)
11:    else if Rank(H2) < Rank(H1) then
12:      (T, H2) ← Extract-Head(H2)
13:    else                                ▷ Equal rank
14:      (T1, H1) ← Extract-Head(H1)
15:      (T2, H2) ← Extract-Head(H2)
16:      T ← Link(T1, T2)
17:    Append-Tree(H, T)
18:  if H1 ≠ ∅ then
19:    Append-Trees(H, H1)
20:  if H2 ≠ ∅ then
21:    Append-Trees(H, H2)
22:  return H

Since both heaps contain binomial trees with ranks in monotonically increasing order, in each iteration we pick the tree with the smallest rank and append it to the result heap. If both trees have the same rank, we perform linking
first. Consider the Append-Tree algorithm: according to our merge strategy, the rank of the new tree to be appended can't be less than that of any other tree in the result heap; however, it might be equal to the rank of the last tree in the result heap. This can happen if the last tree appended is the result of linking, which increases the rank by one. In this case, we must link the new tree with the last tree. In the below algorithm, suppose function Last(H) refers to the last tree in a heap, and Append(H, T) just appends a new tree at the end of a forest.

1: function Append-Tree(H, T)
2:   if H ≠ ∅ ∧ Rank(T) = Rank(Last(H)) then
3:     Last(H) ← Link(T, Last(H))
4:   else
5:     Append(H, T)

Function Append-Trees repeatedly calls this function, so that it can append all the trees in one heap to the other.

1: function Append-Trees(H1, H2)
2:   for each T ∈ H2 do
3:     H1 ← Append-Tree(H1, T)
The following Python program translates the merge algorithm.

def append_tree(ts, t):
    if ts != [] and ts[-1].rank == t.rank:
        ts[-1] = link(ts[-1], t)
    else:
        ts.append(t)
    return ts

def append_trees(ts1, ts2):
    return reduce(append_tree, ts2, ts1)

def merge(ts1, ts2):
    if ts1 == []:
        return ts2
    if ts2 == []:
        return ts1
    ts = []
    while ts1 != [] and ts2 != []:
        t = None
        if ts1[0].rank < ts2[0].rank:
            t = ts1.pop(0)
        elif ts2[0].rank < ts1[0].rank:
            t = ts2.pop(0)
        else:
            t = link(ts1.pop(0), ts2.pop(0))
        ts = append_tree(ts, t)
    ts = append_trees(ts, ts1)
    ts = append_trees(ts, ts2)
    return ts

Exercise 10.3

The program given above uses a container to manage sub-trees. Implement the merge algorithm in your favorite imperative programming language with the 'left-child, right-sibling' approach.

Pop

Among the forest which forms the binomial heap, each binomial tree conforms to the heap property, so the root contains the minimum element of that tree. However, the order relationship among these roots can be arbitrary. To
find the minimum element in the heap, we can select the smallest root of these trees. Since there are O(lg N) binomial trees, this approach takes O(lg N) time. However, after we locate the minimum element (which is also known as the top element of the heap), we need to remove it from the heap while keeping the binomial property, in order to accomplish the heap-pop operation. Suppose the forest forming the binomial heap consists of trees Bi, Bj, ..., Bp, ..., Bm, where Bk is a binomial tree of rank k, and the minimum element is the root of Bp. If we delete it, there will be p children left, which are all binomial trees with ranks p-1, p-2, ..., 0. One tool at hand is the O(lg N) merge function we have
defined. A possible approach is to reverse the p children, so that their ranks change to monotonically increasing order and they form a binomial heap Hp. The rest of the trees is still a
binomial heap; we represent it as H' = H − Bp. Merging Hp and H' gives the
final result of pop. Figure 10.8 illustrates this idea.

[Figure 10.8: Pop the minimum element from a binomial heap.]

In order to realize this algorithm, we
first need to
define an auxiliary function which can extract the tree containing the minimum element at its root from the forest.

extractMin(H) = (T, ∅)            : H is a singleton {T}
              = (T1, H')          : Root(T1) < Root(T')
              = (T', {T1} ∪ H'')  : otherwise    (10.5)

where H = {T1, T2, ...} for the non-empty forest case; H' = {T2, T3, ...} is the forest without the
first tree; and (T', H'') = extractMin(H'). The result of this function is a tuple: the
first part is the tree which has the minimum element at its root; the second part is the rest of the trees after removing the
first part from the forest. This function examines each of the trees in the forest, and thus is bound to O(lg N) time. The relative Haskell program is given below.

extractMin :: (Ord a) => BiHeap a -> (BiTree a, BiHeap a)
extractMin [t] = (t, [])
extractMin (t:ts) = if root t < root t' then (t, ts)
                    else (t', t:ts')
    where (t', ts') = extractMin ts

With this function
defined, returning the minimum element is trivial.

findMin :: (Ord a) => BiHeap a -> a
findMin = root . fst . extractMin

Of course, it's possible to just traverse the forest and pick the minimum root without removing any tree for this purpose. The below imperative algorithm describes it with the 'left-child, right-sibling' approach.

1: function Find-Minimum(H)
2:   T ← Head(H)
3:   min ← ∞
4:   while T ≠ ∅ do
5:     if Key(T) < min then
6:       min ← Key(T)
7:     T ← Sibling(T)
8:   return min

If instead we manage the children with collection containers, the linked-list traversal is abstracted as to
find the minimum element in the list. The following Python program shows this situation.

def find_min(ts):
    min_t = min(ts, key=lambda t: t.key)
    return min_t.key
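A one-line hypothetical check of the functional version (it assumes the earlier insert sketch, so that fromList is available):

checkFindMin :: Bool
checkFindMin = findMin (fromList [3, 1, 4, 1, 5]) == 1
-- duplicated keys are fine; the heap property doesn't require uniqueness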
Next we define the function to delete the minimum element from the heap by using extractMin.

deleteMin(H) = merge(reverse(Children(T)), H')    (10.6)

where (T, H') = extractMin(H).

Translating the formula to a Haskell program is trivial, and we'll skip it. Realizing the algorithm in a procedural way takes extra effort, including list reversing, etc. We leave these details as an exercise to the reader.
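For completeness, the skipped translation might look like this (a sketch, not the book's code; it follows equation (10.6) directly):

deleteMin :: (Ord a) => BiHeap a -> BiHeap a
deleteMin h = merge (reverse (children t)) h'
    where (t, h') = extractMin h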
The following pseudo code illustrates the imperative pop algorithm.

1: function Extract-Min(H)
2:   (Tmin, H) ← Extract-Min-Tree(H)
3:   H ← Merge(H, Reverse(Children(Tmin)))
4:   return (Key(Tmin), H)

With the pop operation defined, we can realize heap sort by creating a binomial heap from a series of numbers, then keep popping the smallest number from the heap until it becomes empty.

sort(xs) = heapSort(fromList(xs))    (10.7)
And the real work is done in function heapSort.

heapSort(H) = ∅                                      : H = ∅
            = {findMin(H)} ∪ heapSort(deleteMin(H))  : otherwise    (10.8)

Translating to Haskell yields the following program.

heapSort :: (Ord a) => [a] -> [a]
heapSort = hsort . fromList where
    hsort [] = []
    hsort h = (findMin h):(hsort $ deleteMin h)

Function fromList can be
defined by folding. Heap sort can also be expressed in a procedural way; please refer to the previous chapter about the binary heap for details.

Exercise 10.4

- Write a program to return the minimum element from a binomial heap in your favorite imperative programming language with the 'left-child, right-sibling' approach.
- Realize the Extract-Min-Tree() algorithm.
- For the 'left-child, right-sibling' approach, reversing all children of a tree is actually reversing a singly linked list. Write a program to reverse such a linked list in your favorite imperative programming language.

More words about the binomial heap

As we have shown, insertion and merge are bound to O(lg N) time, and these results are all ensured for the worst case. The amortized performance is O(1); we skip the proof of this fact.

10.3 Fibonacci Heaps

It's interesting to ask why the name 'Fibonacci heap' was chosen. In fact, there is no direct connection from the structure's design to the Fibonacci series. The inventors of the Fibonacci heap, Michael L. Fredman and Robert E. Tarjan, utilized the properties of the Fibonacci series to prove the performance time bound, so they decided to use Fibonacci to name this data structure [2].

10.3.1 De
finition

A Fibonacci heap is essentially a lazily evaluated binomial heap. Note that this doesn't mean that implementing a binomial heap in a lazy evaluation setting, for instance Haskell, brings the Fibonacci heap automatically. However, a lazy evaluation setting does help in the realization; for example, [5] presents an elegant implementation. The Fibonacci heap has excellent performance theoretically: all operations except for pop are bound to amortized O(1) time. In this section, we'll give an
algorithm different from those in some popular textbooks [2]. Most of the ideas presented here are based on Okasaki's work [6]. Let's review and compare the performance of the binomial heap and the Fibonacci heap (more precisely, the performance goals of the Fibonacci heap).

operation   Binomial heap   Fibonacci heap
insertion   O(lg N)         O(1)
merge       O(lg N)         O(1)
top         O(lg N)         O(1)
pop         O(lg N)         amortized O(lg N)

Consider where the bottleneck of inserting a new element x into a binomial heap is. We actually wrap x as a singleton leaf and insert this tree into the heap, which is actually a forest. During this operation, we insert the tree in monotonically increasing order of rank, and once a rank collides, recursive linking and inserting happen, which leads to the O(lg N) time. With the lazy strategy, we can postpone the ordered-rank insertion and merging operations; on the contrary, we just put the singleton leaf into the forest. The problem is that when we try to
find the minimum element, for example in the top operation, the performance will be bad, because we need to check all the trees in the forest, and there are no longer only O(lg N) of them. In order to locate the top element in constant time, we must remember which tree contains the minimum element as its root. Based on this idea, we can reuse the definition of the binomial tree and give the definition of the Fibonacci heap, for example as the following Haskell program.

data BiTree a = Node { rank :: Int
                     , root :: a
                     , children :: [BiTree a]}

The Fibonacci heap is either empty or a forest of binomial trees, with the tree holding the minimum element stored explicitly in a special slot.

data FibHeap a = E | FH { size :: Int
                        , minTree :: BiTree a
                        , trees :: [BiTree a]}

For convenience, we also add a size
field to record how many elements there are in the heap. The data layout can also be
defined in an imperative way, as in the following ANSI C code.

struct node {
    Key key;
    struct node *next, *prev, *parent, *children;
    int degree;  /* also known as rank */
    int mark;
};

struct FibHeap {
    struct node *roots;
    struct node *minTr;
    int n;  /* number of nodes */
};
For generality, Key can be a customized type; we use integers for illustration purposes.

typedef int Key;

In this chapter, we use the circular doubly linked list in the imperative setting to realize the Fibonacci heap, as described in [2]. It makes many operations easy and fast. Note that there are two extra
fields added. The degree, also known as the rank of a node, is the number of children of this node. The flag mark is used only in the decreasing-key operation; it will be explained in detail in a later section.

10.3.2 Basic heap operations

As we mentioned, the Fibonacci heap is essentially a binomial heap implemented with a lazy evaluation strategy, so we'll reuse many algorithms
defined for the binomial heap.

Insert a new element to the heap

Recall the insertion algorithm for the binomial tree. It can be treated as a special case of the merge operation, where one heap contains only a singleton tree. So the insertion algorithm can be
defined by means of merging.

insert(H, x) = merge(H, singleton(x))    (10.9)

where singleton is an auxiliary function to wrap an element into a one-leaf tree.

singleton(x) = FibHeap(1, node(1, x, ∅), ∅)

Note that function FibHeap() accepts three parameters: a size value, which is 1 for this one-leaf tree; a special tree which contains the minimum element as its root; and a list of the other binomial trees in the forest. The meaning of function node() is the same as before: it creates a binomial tree from a rank, an element, and a list of children. Insertion can also be realized directly by appending the new node to the forest and updating the record of the tree which contains the minimum element.

1: function Insert(H, k)
2:   x ← Singleton(k)    ▷ Wrap k to a node
3:   append x to the root list of H
4:   if Tmin(H) = NIL ∨ k < Key(Tmin(H)) then
5:     Tmin(H) ← x
6:   n(H) ← n(H) + 1

Where function Tmin() returns the tree which contains the minimum element at its root. The following C source snippet is a translation of this algorithm.

struct FibHeap* insert_node(struct FibHeap* h, struct node* x) {
    h = add_tree(h, x);
    if (h->minTr == NULL || x->key < h->minTr->key)
        h->minTr = x;
    h->n++;
    return h;
}
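In Haskell, the same idea is a one-liner on top of merge (a hypothetical sketch, assuming the FH and Node constructors defined above; merge itself is given in the next sub-section):

singleton :: a -> FibHeap a
singleton x = FH 1 (Node 1 x []) []

insert :: (Ord a) => FibHeap a -> a -> FibHeap a
insert h x = merge h (singleton x)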
Exercise 10.5

Implement the insert algorithm completely in your favorite imperative programming language. This is also an exercise in circular doubly linked list manipulation.

Merge two heaps

Different from the merging algorithm of the binomial heap, we postpone the linking operations. The idea is to just put all the binomial trees from each heap together, and choose one special tree which records the minimum element for the result heap.

merge(H1, H2) = H1                                          : H2 = ∅
              = H2                                          : H1 = ∅
              = FibHeap(s1 + s2, T1min, {T2min} ∪ T1 ∪ T2)  : root(T1min) < root(T2min)
              = FibHeap(s1 + s2, T2min, {T1min} ∪ T1 ∪ T2)  : otherwise    (10.10)

where s1 and s2 are the sizes of H1 and H2; T1min and T2min are the special trees with the minimum element as root in H1 and H2 respectively; T1 = {T11, T12, ...} is the forest containing all the other binomial trees in H1, and T2 has the same meaning with respect to H2. Function root(T) returns the root element of a binomial tree. Note that as long as the ∪ operation takes constant time, this merge algorithm is bound to O(1). The following Haskell program is the translation of this algorithm.

merge :: (Ord a) => FibHeap a -> FibHeap a -> FibHeap a
merge h E = h
merge E h = h
merge h1@(FH sz1 minTr1 ts1) h2@(FH sz2 minTr2 ts2)
    | root minTr1 < root minTr2 = FH (sz1+sz2) minTr1 (minTr2:ts2++ts1)
    | otherwise = FH (sz1+sz2) minTr2 (minTr1:ts1++ts2)

The merge algorithm can also be realized imperatively by concatenating the root lists of the two heaps.

1: function Merge(H1, H2)
2:   H ← ∅
3:   Root(H) ← Concat(Root(H1), Root(H2))
4:   if Key(Tmin(H1)) < Key(Tmin(H2)) then
5:     Tmin(H) ← Tmin(H1)
6:   else
7:     Tmin(H) ← Tmin(H2)
8:   n(H) ← n(H1) + n(H2)
9:   release H1 and H2
10:  return H

This function assumes that neither H1 nor H2 is empty, and it's easy to add handling for these special cases, as in the following ANSI C program.

struct FibHeap* merge(struct FibHeap* h1, struct FibHeap* h2) {
    struct FibHeap* h;
    if (is_empty(h1))
        return h2;
    if (is_empty(h2))
        return h1;
    h = empty();
    h->roots = concat(h1->roots, h2->roots);
    if (h1->minTr->key < h2->minTr->key)
        h->minTr = h1->minTr;
    else
        h->minTr = h2->minTr;
    h->n = h1->n + h2->n;
    free(h1);
    free(h2);
    return h;
}

With the merge function
defined, the O(1) insertion algorithm is realized as well. And we can also give the O(1) time top function as below.

top(H) = root(Tmin)    (10.11)

Exercise 10.6

Implement the circular doubly linked list concatenation function in your favorite imperative programming language.

Extract the minimum element from the heap (pop)

The pop (delete the minimum element) operation is the most complex one in the Fibonacci heap. Since we postponed the tree consolidation in the merge algorithm, we have to compensate for it somewhere. Pop is the only place left, as we have
defined insert, merge, and top already. There is an elegant procedural algorithm to do the tree consolidation by using an auxiliary array [2]; we'll show it later in the imperative approach section. In order to realize the purely functional consolidation algorithm, let's
first consider a similar number puzzle. Given a list of numbers, such as {2, 1, 1, 4, 8, 1, 1, 2, 4}, we want to add any two values if they are the same, and repeat this procedure until all numbers are unique. The result of the example list should be {8, 16}, for instance. One solution to this problem is the following.

consolidate(L) = fold(meld, ∅, L)    (10.12)

where the fold() function is
defined to iterate over all elements of a list, applying a
specified function to the intermediate result and each element; it is sometimes called reducing. Please refer to the chapter about the binary search tree for it. L = {x1, x2, ..., xn} denotes a list of numbers, and we'll use L' = {x2, x3, ..., xn} to represent the rest of the list with the
first element removed. Function meld() is
defined as below.

meld(L, x) = {x}                : L = ∅
           = meld(L', x + x1)   : x = x1
           = {x} ∪ L            : x < x1
           = {x1} ∪ meld(L', x) : otherwise    (10.13)
Table 10.1: Steps of consolidating numbers

number | intermediate result | result
2      | 2                   | 2
1      | 1, 2                | 1, 2
1      | (1+1), 2            | 4
4      | (4+4)               | 8
8      | (8+8)               | 16
1      | 1, 16               | 1, 16
1      | (1+1), 16           | 2, 16
2      | (2+2), 16           | 4, 16
4      | (4+4), 16           | 8, 16

The consolidate() function works as follows. It maintains an ordered result list L, containing only unique numbers, which is initialized as an empty list ∅. Each time it processes an element x, it first checks whether the first element in L is equal to x; if so, it adds them together (which yields 2x) and repeatedly checks whether 2x is equal to the next element in L. This process doesn't stop until either the element to be melded is not equal to the head element in the rest of the list, or the list becomes empty. Table 10.1 illustrates the process of consolidating the number sequence {2, 1, 1, 4, 8, 1, 1, 2, 4}. Column one lists the numbers 'scanned' one by one; column two shows the intermediate result, typically the newly scanned number compared with the first number in the result list; if they are equal, they are enclosed in a pair of parentheses. The last column is the result of meld, and it is used as the input to the next step. The Haskell program can be given accordingly.

consolidate = foldl meld [] where
    meld [] x = [x]
    meld (x':xs) x | x == x' = meld xs (x+x')
                   | x < x' = x:x':xs
                   | otherwise = x': meld xs x
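A quick hypothetical check against Table 10.1:

checkConsolidate :: Bool
checkConsolidate = consolidate [2, 1, 1, 4, 8, 1, 1, 2, 4] == [8, 16]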
We'll analyze the performance of consolidation as a part of the pop operation in a later section. The tree consolidation is very similar to this algorithm, except that it works on ranks: the only thing we need to do is modify the meld() function a bit, so that it compares ranks and performs linking instead of adding.

meld(L, x) = {x}                     : L = ∅
           = meld(L', link(x, x1))   : rank(x) = rank(x1)
           = {x} ∪ L                 : rank(x) < rank(x1)
           = {x1} ∪ meld(L', x)      : otherwise    (10.14)

The final consolidate Haskell program changes to the below version.

consolidate :: (Ord a) => [BiTree a] -> [BiTree a]
consolidate = foldl meld [] where
    meld [] t = [t]
    meld (t':ts) t | rank t == rank t' = meld ts (link t t')
                   | rank t < rank t' = t:t':ts
                   | otherwise = t' : meld ts t
Figures 10.9 and 10.10 show the steps of consolidation when processing a Fibonacci heap containing trees of different ranks. Comparing them with Table 10.1 reveals the similarity.

[Figure 10.9: Steps of consolidation. (a) Before consolidation; (b) steps 1 and 2; (c) step 3: 'd' is first linked to 'c', then repeatedly linked to 'a'; (d) step 4.]

After we merge all the binomial trees of a Fibonacci heap, including the special tree recording the minimum element at its root, the heap becomes a binomial heap, and we lose the special tree which gave us the ability to return the top element in O(1) time. It's necessary to perform an O(lg N) time search to restore the special tree. We can reuse the function extractMin()
defined for the binomial heap. It's time to give the
final pop function for the Fibonacci heap, as all the sub-problems have been solved. Let Tmin denote the special tree of the heap recording the minimum element at its root; T denotes the forest containing all the other trees except for the special one; s represents the size of the heap; and function children() returns all sub-trees except the root of a binomial tree.

deleteMin(H) = ∅                          : T = ∅ ∧ children(Tmin) = ∅
             = FibHeap(s − 1, T'min, T')  : otherwise    (10.15)
[Figure 10.10: Steps of consolidation. (a) Step 5; (b) step 6; (c) steps 7 and 8: 'r' is first linked to 'q', then 's' is linked to 'q'.]
where (T'min, T') = extractMin(consolidate(children(Tmin) ∪ T)).

Translating to Haskell yields the below program.

deleteMin :: (Ord a) => FibHeap a -> FibHeap a
deleteMin (FH _ (Node _ x []) []) = E
deleteMin h@(FH sz minTr ts) = FH (sz-1) minTr' ts' where
    (minTr', ts') = extractMin $ consolidate (children minTr ++ ts)

The main part of the imperative realization is similar. We cut all the children of Tmin and append them to the root list, then perform consolidation to merge all the trees of the same rank until every tree has a unique rank.

1: function Delete-Min(H)
2:   x ← Tmin(H)
3:   if x ≠ NIL then
4:     for each y ∈ Children(x) do
5:       append y to the root list of H
6:       Parent(y) ← NIL
7:     remove x from the root list of H
8:     n(H) ← n(H) - 1
9:     Consolidate(H)
10:  return x

Algorithm Consolidate utilizes an auxiliary array A to do the merge job. Array slot A[i] is
defined to store the tree of rank (degree) i. During the traversal of the root list, if we meet another tree of rank i, we link the two together to get a new tree of rank i + 1. Next we clear A[i], check whether A[i+1] is empty, and perform further linking if necessary. After we
finish traversing all the roots, array A stores all the result trees, and we can re-construct the heap from it.

1: function Consolidate(H)
2:   D ← Max-Degree(n(H))
3:   for i ← 0 to D do
4:     A[i] ← NIL
5:   for each x ∈ the root list of H do
6:     remove x from the root list of H
7:     d ← Degree(x)
8:     while A[d] ≠ NIL do
9:       y ← A[d]
10:      x ← Link(x, y)
11:      A[d] ← NIL
12:      d ← d + 1
13:    A[d] ← x
14:  Tmin(H) ← NIL                ▷ The root list is empty at this point
15:  for i ← 0 to D do
16:    if A[i] ≠ NIL then
17:      append A[i] to the root list of H
18:      if Tmin(H) = NIL ∨ Key(A[i]) < Key(Tmin(H)) then
19:        Tmin(H) ← A[i]

The only unclear sub-algorithm is Max-Degree, which determines the upper bound of the degree of any node in a Fibonacci heap. We'll delay its
realization to the last sub-section. Feeding the Fibonacci heap shown in Figure 10.9 to the above algorithm, Figures 10.11, 10.12, and 10.13 show the result trees stored in the auxiliary array A at every step.

[Figure 10.11: Steps of consolidation. (a) Steps 1 and 2. (b) Step 3: since A[0] ≠ NIL, 'd' is
first linked to 'c', and A[0] is cleared to NIL; again, as A[1] ≠ NIL, 'c' is linked to 'a', and the new tree is stored in A[2]. (c) Step 4.]

Translating the above algorithm to ANSI C yields the below program.

void consolidate(struct FibHeap* h) {
    if (!h->roots)
        return;
    int D = max_degree(h->n) + 1;
    struct node *x, *y;
    struct node** a = (struct node**) malloc(sizeof(struct node*) * (D + 1));
    int i, d;
    for (i = 0; i <= D; ++i)
        a[i] = NULL;
    while (h->roots) {
        x = h->roots;
        h->roots = remove_node(h->roots, x);
        d = x->degree;
        while (a[d]) {
            y = a[d];  /* another node has the same degree as x */
            x = link(x, y);
            a[d++] = NULL;
        }
        a[d] = x;
    }
    h->minTr = h->roots = NULL;
    for (i = 0; i <= D; ++i)
        if (a[i]) {
            h->roots = append(h->roots, a[i]);
            if (h->minTr == NULL || a[i]->key < h->minTr->key)
                h->minTr = a[i];
        }
    free(a);
}
[Figure 10.12: Steps of consolidation. (a) Step 5; (b) step 6.]
[Figure 10.13: Steps of consolidation. (a) Steps 7 and 8: since A[0] ≠ NIL, 'r' is first linked to 'q', and the new tree is stored in A[1] (A[0] is cleared); then 's' is linked to 'q' and stored in A[2] (A[1] is cleared).]

Exercise 10.7

Implement the remove function for the circular doubly linked list in your favorite imperative programming language.

10.3.3 Running time of pop

In order to analyze the amortized performance of pop, we adopt the potential method. The reader can refer to [2] for a formal
definition; in this chapter, we only give an intuitive illustration. Recall gravitational potential energy, which is
defined as

E = M · g · h

Suppose there is a complex process which moves an object of mass M up and down, and
finally the object stops at height h'. If there also exists a friction resistance Wf, we say the process performs the following amount of work:

W = M · g · (h' − h) + Wf

Figure 10.14 illustrates this concept.
[Figure 10.14: Gravitational potential energy.]

We treat the Fibonacci heap pop operation in a similar way. In order to evaluate the cost, we
first define the potential Φ(H) before extracting the minimum element. This potential is accumulated by the insertion and merge operations executed so far. After the tree consolidation, we get the result H', and we then calculate the new potential Φ(H'). The difference between Φ(H') and Φ(H), plus the contribution of the consolidate algorithm, indicates the amortized performance of pop. For the pop operation analysis, the potential can be
defined as

Φ(H) = t(H)    (10.16)

where t(H) is the number of trees in the Fibonacci heap forest. We have t(H) = 1 + length(T) for any non-empty heap. For an N-node Fibonacci heap, suppose there is an upper bound D(N) on the ranks of all the trees. After consolidation, it is ensured that the number of trees in the heap forest is at most D(N) + 1. Before consolidation, we actually did another important thing which also contributes to the running time: we removed the root of the minimum tree and concatenated the children left behind to the forest. So the consolidate operation processes at most D(N) + t(H) − 1 trees. Summarizing all the above factors, we deduce the amortized cost as below:

T = Tconsolidation + Φ(H') − Φ(H)
  = O(D(N) + t(H) − 1) + (D(N) + 1) − t(H)
  = O(D(N))    (10.17)

If only the insertion, merge, and pop functions are applied to the Fibonacci heap, we ensure that all trees are binomial trees, and it is easy to estimate the upper limit D(N) as O(lg N) (supposing the extreme case that all nodes are in only one binomial tree). However, we'll show in the next sub-section that there is an operation which can violate the binomial tree assumption.
10.3.4 Decreasing key

There is one special heap operation left. It only makes sense in imperative settings: decreasing the key of a certain node. Decreasing key plays an important role in some graph algorithms, such as the minimum spanning tree algorithm and Dijkstra's algorithm [2]. In those cases, we hope the decrease-key takes O(1) amortized time. However, we can't
define a function like Decrease(H, k, k'), which
first locates a node with key k, then decreases k to k' by replacement, and then restores the heap properties. This is because the locating phase is bound to O(N) time, since we don't have a pointer to the target node. In an imperative setting, we can
define the algorithm as Decrease-Key(H, x, k). Here x is a node in heap H whose key we want to decrease to k. We needn't perform a search, as we have x at hand. It's possible to give an amortized O(1) solution. When we decrease the key of a node, if it's not a root, this operation may violate the property of the binomial tree that the key of the parent is less than all the keys of its children. So we need to compare the decreased key with that of the parent node, and if this case happens, we can cut this node off and append it to the root list. (Recall the recursive swapping solution for the binary heap, which leads to O(lg N).)

[Figure 10.15: x < y: cut tree x from its parent, and add x to the root list.]

Figure 10.15 illustrates this situation. After decreasing the key of node x, it is less than that of y; we cut x off its parent y, and 'paste' the whole tree rooted at x to the root list. Although we recover the property that a parent is less than all its children, the tree is no longer a binomial tree after it loses some sub-tree. If a tree loses too many of its children because of cutting, we can't ensure the performance of the mergeable heap operations. The Fibonacci heap adds another constraint to avoid such a problem:

If a node loses its second child, it is immediately cut from its parent and added to the root list.
The final Decrease-Key algorithm is given as below.

1: function Decrease-Key(H, x, k)
2:   Key(x) ← k
3:   p ← Parent(x)
4:   if p ≠ NIL ∧ k < Key(p) then
5:     Cut(H, x)
6:     Cascading-Cut(H, p)
7:   if k < Key(Tmin(H)) then
8:     Tmin(H) ← x

Where function Cascading-Cut uses the mark to determine whether the node is losing its second child. A node is marked after it loses its first child, and the mark is cleared in the Cut function.

1: function Cut(H, x)
2:   p ← Parent(x)
3:   remove x from p
4:   Degree(p) ← Degree(p) − 1
5:   add x to root list of H
6:   Parent(x) ← NIL
7:   Mark(x) ← FALSE

During the cascading cut process, if x is marked, it means it has already lost one child. We recursively perform cut and cascading cut on its parent until we reach a root.

1: function Cascading-Cut(H, x)
2:   p ← Parent(x)
3:   if p ≠ NIL then
4:     if Mark(x) = FALSE then
5:       Mark(x) ← TRUE
6:     else
7:       Cut(H, x)
8:       Cascading-Cut(H, p)

The relevant ANSI C decreasing key program is given as the following.

void decrease_key(struct FibHeap* h, struct node* x, Key k) {
    struct node* p = x->parent;
    x->key = k;
    if (p && k < p->key) {
        cut(h, x);
        cascading_cut(h, p);
    }
    if (k < h->minTr->key)
        h->minTr = x;
}

void cut(struct FibHeap* h, struct node* x) {
    struct node* p = x->parent;
    p->children = remove_node(p->children, x);
    p->degree--;
    h->roots = append(h->roots, x);
    x->parent = NULL;
    x->mark = 0;
}

void cascading_cut(struct FibHeap* h, struct node* x) {
    struct node* p = x->parent;
    if (p) {
        if (!x->mark)
            x->mark = 1;
        else {
            cut(h, x);
            cascading_cut(h, p);
        }
    }
}

Exercise 10.8

Prove that the Decrease-Key algorithm is amortized O(1) time.

10.3.5 The name of Fibonacci Heap

It's time to reveal the reason why the data structure is named 'Fibonacci heap'. There is only one
undefined algorithm so far: Max-Degree(N), which determines the upper bound of the degree of any node in an N-node Fibonacci heap. We'll give the proof by using the Fibonacci series and finally realize the Max-Degree algorithm.

Lemma 10.3.1. For any node x in a Fibonacci heap, denote k = degree(x) and |x| = size(x); then

|x| ≥ F_{k+2}    (10.18)

where F_k is the Fibonacci series defined as the following.

F_k = 0                  : k = 0
      1                  : k = 1
      F_{k−1} + F_{k−2}  : k ≥ 2

Proof. Consider all k children of node x. Denote them as y_1, y_2, ..., y_k in the order of the time when they were linked to x, where y_1 is the oldest and y_k is the youngest. Obviously degree(y_1) ≥ 0. When we link y_i to x, the children y_1, y_2, ..., y_{i−1} were already there, and the Link algorithm only links nodes of the same degree. This indicates that at that time we had

degree(y_i) = degree(x) ≥ i − 1

After that, node y_i can lose at most one child (due to the decreasing key operation); otherwise it would be immediately cut off and appended to the root list after the second child loss. Thus we conclude

degree(y_i) ≥ i − 2

for any i = 2, 3, ..., k. Let s_k be the minimum possible size of node x, where degree(x) = k. For the trivial cases, s_0 = 1, s_1 = 2, and we have

|x| ≥ s_k = 2 + Σ_{i=2}^{k} s_{degree(y_i)}
          ≥ 2 + Σ_{i=2}^{k} s_{i−2}

(the leading 2 counts x itself and y_1). We next show that s_k ≥ F_{k+2}. This can be proved by induction. For the trivial cases, we have s_0 = 1 ≥ F_2 = 1 and s_1 = 2 ≥ F_3 = 2. For the induction case k ≥ 2, we have

|x| ≥ s_k ≥ 2 + Σ_{i=2}^{k} s_{i−2} ≥ 2 + Σ_{i=2}^{k} F_i = 1 + Σ_{i=0}^{k} F_i

At this point, we need to prove that

F_{k+2} = 1 + Σ_{i=0}^{k} F_i    (10.19)

This can also be proved by induction. For the trivial case, F_2 = 1 = 1 + F_0. For the induction case,

F_{k+2} = F_{k+1} + F_k = 1 + Σ_{i=0}^{k−1} F_i + F_k = 1 + Σ_{i=0}^{k} F_i

Summarizing all the above, we have the final result.

N ≥ |x| ≥ F_{k+2}    (10.20)
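The degree bound now follows directly from the lemma; a short derivation (using the Fibonacci growth bound F_{k+2} ≥ φ^k recalled below):

N ≥ |x| ≥ F_{k+2} ≥ φ^k  ⟹  k ≤ log_φ N

So the degree of any node in an N-node Fibonacci heap is at most ⌊log_φ N⌋ = O(lg N), which is exactly the MaxDegree(N) defined in (10.21).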
Recall the result from the AVL tree chapter that the Fibonacci numbers grow exponentially: F_{k+2} ≥ φ^k, where φ = (1 + √5)/2 is the golden ratio. Combined with (10.17), this means the pop operation is an amortized O(lg N) algorithm. Based on this result, we can define the function MaxDegree as the following:

MaxDegree(N) = 1 + ⌊log_φ N⌋    (10.21)

The imperative Max-Degree algorithm can also be realized by using the Fibonacci sequence.

1: function Max-Degree(N)
2:   F_0 ← 0
3:   F_1 ← 1
4:   k ← 2
5:   repeat
6:     F_k ← F_{k−1} + F_{k−2}
7:     k ← k + 1
8:   until F_k ≥ N
9:   return k − 2

Translating the algorithm to ANSI C gives the following program.

int max_degree(int n) {
    int k, F;
    int F2 = 0;
    int F1 = 1;
    for (F = F1 + F2, k = 2; F < n; ++k) {
        F2 = F1;
        F1 = F;
        F = F1 + F2;
    }
    return k - 2;
}

10.4 Pairing Heaps

Although Fibonacci heaps provide excellent performance theoretically, they are complex to realize. People find that the constant behind the big-O is big; actually, the Fibonacci heap is more significant in theory than in practice. In this section, we'll introduce another solution, the pairing heap, which is one of the best-performing heaps known. Most operations, including insertion, finding the minimum element (top), and merging, are all bound to O(1) time, while deleting the minimum element (pop) is conjectured to be amortized O(lg N) time [7] [6]. Note that this has remained a conjecture for 15 years as of the time this chapter was written. Nobody has proven it, although much experimental data supports the O(lg N) amortized result. Besides that, the pairing heap is simple: there exist both elegant imperative and functional implementations.

10.4.1 Definition

Both binomial heaps and Fibonacci heaps are realized with a forest, while a pairing heap is essentially a K-ary tree. The minimum element is stored at the root; all other elements are stored in sub-trees.
The following Haskell program defines the pairing heap.

data PHeap a = E | Node a [PHeap a]

This is a recursive definition: a pairing heap is either empty, or a K-ary tree which consists of a root node and a list of sub-trees. The pairing heap can also be defined in procedural languages, for example ANSI C as below. For illustration purposes, all heaps mentioned later are min-heaps, and we assume the type of the key is integer⁴. We use the same linked-list based left-child, right-sibling approach (a.k.a. the binary tree representation [2]).

typedef int Key;

struct node {
    Key key;
    struct node *next, *children, *parent;
};

Note that the parent field only makes sense for the decreasing key operation, which will be explained later on; we can omit it for the time being.

10.4.2 Basic heap operations

In this section, we first give the merging operation for the pairing heap, which can be used to realize insertion. Merging, insertion, and finding the minimum element are relatively trivial compared to the extracting minimum element operation.

Merge, insert, and find the minimum element (top)

The idea of merging is similar to the linking algorithm we showed previously for the binomial heap. When we merge two pairing heaps, there are two cases. In the trivial case, one heap is empty and we simply return the other heap as the result; otherwise, we compare the root elements of the two heaps and make the heap with the bigger root element a new child of the other. Let H1 and H2 denote the two heaps, and x and y be the root elements of H1 and H2 respectively. Function Children() returns the children of a K-ary tree. Function Node() constructs a K-ary tree from a root element and a list of children.

merge(H1, H2) = H1                              : H2 = ∅
                H2                              : H1 = ∅
                Node(x, {H2} ∪ Children(H1))    : x < y
                Node(y, {H1} ∪ Children(H2))    : otherwise    (10.22)

where

x = Root(H1)
y = Root(H2)

⁴We can parameterize the key type with a C++ template, but this is beyond our scope; please refer to the example programs along with this book.
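As a quick sanity check of equation (10.22), merging two singleton heaps keeps the smaller root on top; for instance:

merge(Node(2, ∅), Node(5, ∅)) = Node(2, {Node(5, ∅)})
merge(Node(5, ∅), Node(2, ∅)) = Node(2, {Node(5, ∅)})

In both cases the result is the 2-rooted tree with the 5-tree as its only child.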
It's obvious that the merging algorithm is bound to O(1) time⁵. The merge equation can be translated to the following Haskell program.

merge :: (Ord a) => PHeap a -> PHeap a -> PHeap a
merge h E = h
merge E h = h
merge h1@(Node x hs1) h2@(Node y hs2) =
    if x < y then Node x (h2:hs1)
    else Node y (h1:hs2)

Merge can also be realized imperatively. With the left-child, right-sibling approach, we can just link the heap (which is in fact a K-ary tree) with the larger key as the first new child of the other. This is a constant time operation, as described below.

1: function Merge(H1, H2)
2:   if H1 = NIL then
3:     return H2
4:   if H2 = NIL then
5:     return H1
6:   if Key(H2) < Key(H1) then
7:     Exchange(H1 ↔ H2)
8:   Insert H2 in front of Children(H1)
9:   Parent(H2) ← H1
10:  return H1

Note that we also update the parent field accordingly. The ANSI C example program is given as the following.

struct node* merge(struct node* h1, struct node* h2) {
    if (h1 == NULL)
        return h2;
    if (h2 == NULL)
        return h1;
    if (h2->key < h1->key)
        swap(h1, h2);
    h2->next = h1->children;
    h1->children = h2;
    h2->parent = h1;
    h1->next = NULL; /* Break previous link if any */
    return h1;
}

Where the function swap() is defined in a similar way as for the Fibonacci heap. With merge defined, insertion can be realized exactly as for the Fibonacci heap in Equation 10.9; it's definitely an O(1) time operation. As the minimum element is always stored in the root, finding it is trivial.

top(H) = Root(H)    (10.23)

Same as the two operations above, it's bound to O(1) time.

Exercise 10.9

Implement the insertion and top operation in your favorite programming language.

⁵Assume ∪ is a constant time operation; this is true for linked-list settings, including the 'cons'-like operation in functional programming languages.
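On the functional side, the insertion and top operations of Exercise 10.9 are almost one-liners; a minimal Haskell sketch reusing the merge function above (insert corresponds to Equation 10.9):

insert :: (Ord a) => PHeap a -> a -> PHeap a
insert h x = merge h (Node x [])

top :: PHeap a -> a
top (Node x _) = x

Both are clearly O(1): insert performs a single merge with a singleton tree, and top only inspects the root.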
Decrease key of a node

There is another trivial operation: decreasing the key of a given node, which only makes sense in imperative settings, as we explained in the Fibonacci heap section. The solution is simple: we cut the node with the new, smaller key from its parent, along with all its children, and then merge it back into the heap. The only special case is when the given node is the root; in that case we can directly set the new key without doing anything else. The following algorithm describes this procedure for a given node x with new key k.

1: function Decrease-Key(H, x, k)
2:   Key(x) ← k
3:   if Parent(x) ≠ NIL then
4:     Remove x from Children(Parent(x))
5:     Parent(x) ← NIL
6:     return Merge(H, x)
7:   return H

The following ANSI C program translates this algorithm.

struct node* decrease_key(struct node* h, struct node* x, Key key) {
    x->key = key; /* Assume key < x->key */
    if (x->parent) {
        x->parent->children = remove_node(x->parent->children, x);
        x->parent = NULL;
        return merge(h, x);
    }
    return h;
}

Exercise 10.10

- Implement the program of removing a node from the children of its parent in your favorite imperative programming language.
- Consider: how can we ensure that the overall performance of decreasing a key is O(1) time? Is the left-child, right-sibling approach enough?

Delete the minimum element from the heap (pop)

Since the minimum element is always stored at the root, after deleting it during popping, what remains are all the sub-trees. These trees can be merged into one big tree.

pop(H) = mergePairs(Children(H))    (10.24)

The pairing heap uses a special approach: it merges every two sub-trees from left to right in pairs, then merges these paired results from right to left, which forms the final result tree. The name 'pairing heap' comes from this characteristic of pair-merging. Figures 10.16 and 10.17 illustrate the procedure of pair-merging.

The recursive pair-merging solution is quite similar to bottom-up merge sort [6]. Denote the children of a pairing heap as A, a list of trees {T1, T2, T3, ..., Tm} for example. The mergePairs() function is given below, after the figures.
Figure 10.16: Remove the root element, and merge children in pairs. (a) A pairing heap before pop. (b) After root element 2 is removed, there are 9 sub-trees left. (c) Merge every two trees in pair; note that there is an odd number of trees, so the last one needn't merge.
Figure 10.17: Steps of merge from right to left. (a) Merge the tree with root 9 and the tree with root 6. (b) Merge the tree with root 7 to the result. (c) Merge the tree with root 3 to the result. (d) Merge the tree with root 4 to the result.
mergePairs(A) = ∅                                       : A = ∅
                T1                                      : A = {T1}
                merge(merge(T1, T2), mergePairs(A'))    : otherwise    (10.25)

where

A' = {T3, T4, ..., Tm}

is the rest of the children without the first two trees. The relative Haskell program of popping is given as the following.

deleteMin :: (Ord a) => PHeap a -> PHeap a
deleteMin (Node _ hs) = mergePairs hs
  where
    mergePairs [] = E
    mergePairs [h] = h
    mergePairs (h1:h2:hs) = merge (merge h1 h2) (mergePairs hs)

The popping operation can also be explained in the following procedural algorithm.

1: function Pop(H)
2:   L ← NIL
3:   for every 2 trees Tx, Ty ∈ Children(H) from left to right do
4:     Extract Tx and Ty from Children(H)
5:     T ← Merge(Tx, Ty)
6:     Insert T at the beginning of L
7:   H ← Children(H)    ▷ H is either NIL or one tree.
8:   for ∀T ∈ L from left to right do
9:     H ← Merge(H, T)
10:  return H

Note that L is initialized as an empty linked-list. The algorithm then iterates over every two trees in pair among the children of the K-ary tree, from left to right, and performs merging; the result is inserted at the beginning of L. Because we insert at the front end, when we traverse L later on we actually process from right to left. There may be an odd number of sub-trees in H; in that case one tree is left over after pair-merging. We handle it by starting the right-to-left merging from this leftover tree. Below is the ANSI C program for this algorithm.

struct node* pop(struct node* h) {
    struct node *x, *y, *lst = NULL;
    while ((x = h->children) != NULL) {
        if ((h->children = y = x->next) != NULL)
            h->children = h->children->next;
        lst = push_front(lst, merge(x, y));
    }
    x = NULL;
    while ((y = lst) != NULL) {
        lst = lst->next;
        x = merge(x, y);
    }
    free(h);
    return x;
}

The pairing heap pop operation is conjectured to be amortized O(lg N) time [7].

Exercise 10.11

Write a program to insert a tree at the beginning of a linked-list in your favorite imperative programming language.

Delete a node

We didn't mention delete in the binomial heap or the Fibonacci heap. Deletion can be realized by
first decreasing the key to minus infinity (−∞) and then performing pop. In this section, we present another solution for deleting a node. The algorithm is to define the function delete(H, x), where x is a node in a pairing heap H⁶. If x is the root, we can just perform a pop operation. Otherwise, we cut x from H, perform a pop on x, and then merge the pop result back into H. This can be described as the following.

delete(H, x) = pop(H)                      : x is root of H
               merge(cut(H, x), pop(x))    : otherwise    (10.26)

As the delete algorithm uses pop, its performance is conjectured to be amortized O(lg N) time as well.

Exercise 10.12

- Write procedural pseudo code for the delete algorithm.
- Write the delete operation in your favorite imperative programming language.
- Consider how to realize delete in a purely functional setting.

10.5 Notes and short summary

In this chapter, we extended the heap implementation from the binary tree to a more generic approach. The binomial heap and Fibonacci heap use a forest of K-ary trees as the underlying data structure, while the pairing heap uses a single K-ary tree to represent the heap. It's a good idea to postpone some expensive operations, so that the overall amortized performance is ensured. Although the Fibonacci heap gives good performance in theory, its implementation is a bit complex, and it has been removed from some recent textbooks. We also presented the pairing heap, which is easy to realize and performs well in practice.

The elementary tree based data structures have now all been introduced in this book. There are still many tree based data structures which we can't cover, and we skip them here; we encourage the reader to refer to other textbooks about them. From the next chapter, we'll introduce generic sequence data structures: array and queue.

⁶Here the semantics of x is a reference to a node.
Bibliography

[1] K-ary tree, Wikipedia. http://en.wikipedia.org/wiki/K-ary_tree

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.

[3] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, (July 1, 1999), ISBN-13: 978-0521663502

[4] Wikipedia, Pascal's triangle. http://en.wikipedia.org/wiki/Pascal's_triangle

[5] Hackage. An alternate implementation of a priority queue based on a Fibonacci heap. http://hackage.haskell.org/packages/archive/pqueue-mtl/1.0.7/doc/html/src/Data-Queue-FibQueue.html

[6] Chris Okasaki. Fibonacci Heaps. http://darcs.haskell.org/nofib/gc/fibheaps/orig

[7] Michael L. Fredman, Robert Sedgewick, Daniel D. Sleator, and Robert E. Tarjan. The Pairing Heap: A New Form of Self-Adjusting Heap. Algorithmica (1986) 1: 111-129.
Part IV

Queues and Sequences
Chapter 11

Queue, not so simple as it was thought

11.1 Introduction

It seems that queues are relatively simple. A queue provides FIFO (first-in, first-out) data manipulation support. There are many options to realize a queue, including singly linked-list, doubly linked-list, circular buffer, etc. However, we'll show that it's not so easy to realize a queue in purely functional settings if it must satisfy the abstract queue properties.

In this chapter, we'll present several different approaches to implement a queue, and in the next chapter we'll explain how to realize a sequence.

A queue is a FIFO data structure satisfying the following performance constraints:

- An element can be added to the tail of the queue in O(1) constant time;
- An element can be removed from the head of the queue in O(1) constant time.

These two properties must be satisfied, and it's common to add some extra goals, such as dynamic memory allocation, etc.

Of course such an abstract queue interface can be implemented trivially with a doubly-linked list, but this is an overkill solution. We can even implement an imperative queue with a singly linked-list or a plain array. However, our main question here is how to realize a purely functional queue as well.

We'll first review the typical queue solutions realized by singly linked-list and circular buffer in the first section; then we give a simple and straightforward functional solution in the second section. While the performance is ensured in terms of amortized constant time, we need to find a real-time solution (or worst-case solution) for some special cases; such solutions will be described in the third and fourth sections. Finally, we'll show a very simple real-time queue which depends on lazy evaluation.

Most of the functional content is based on Chris Okasaki's great work in [6]. There are more than 16 different types of purely functional queue given in that material.
11.2 Queue by linked-list and circular buffer

11.2.1 Singly linked-list solution

A queue can be implemented with a singly linked-list. It's easy to add and remove an element at the front end of a linked-list in O(1) time. However, in order to keep the FIFO order, if we execute one operation on the head, we must perform the inverse operation on the tail. For a plain singly linked-list, we must traverse the whole list before adding or removing at the tail; traversing is bound to O(N) time, where N is the length of the list. This doesn't match the abstract queue properties. The solution is to use an extra record to store the tail of the linked-list. A sentinel is often used to simplify the boundary handling. The following ANSI C¹ code defines a queue realized by a singly linked-list.

typedef int Key;

struct Node {
    Key key;
    struct Node *next;
};

struct Queue {
    struct Node *head, *tail;
};

Figure 11.1 illustrates an empty list: both head and tail point to the sentinel node.

Figure 11.1: The empty queue; both head and tail point to the sentinel node.

We summarize the abstract queue interface as the following.

function Empty    ▷ Create an empty queue
function Empty?(Q)    ▷ Test if Q is empty
function Enqueue(Q, x)    ▷ Add a new element x to queue Q
function Dequeue(Q)    ▷ Remove an element from queue Q
function Head(Q)    ▷ Get the next element in queue Q in FIFO order

¹It's possible to parameterize the type of the key with a C++ template. ANSI C is used here for illustration purpose.
Note the difference between Dequeue and Head: Head only retrieves the next element in FIFO order without removing it, while Dequeue performs removing.

In some programming languages, such as Haskell, and in most object-oriented languages, the above abstract queue interface can be ensured by a definition. For example, the following Haskell code specifies the abstract queue.

class Queue q where
    empty :: q a
    isEmpty :: q a -> Bool
    push :: q a -> a -> q a -- aka 'snoc' or append, or push_back
    pop :: q a -> q a -- aka 'tail' or pop_front
    front :: q a -> a -- aka 'head'

To ensure constant time Enqueue and Dequeue, we add new elements at the tail and remove elements from the head.²

function Enqueue(Q, x)
  p ← Create-New-Node
  Key(p) ← x
  Next(p) ← NIL
  Next(Tail(Q)) ← p
  Tail(Q) ← p

Note that, as we use the sentinel node, there is at least one node (the sentinel) in the queue. That's why we needn't check the validity of the tail before we append the newly created node p to it.

function Dequeue(Q)
  x ← Head(Q)
  Next(Head(Q)) ← Next(x)
  if x = Tail(Q) then    ▷ Q gets empty
    Tail(Q) ← Head(Q)
  return Key(x)

As we always put the sentinel node in front of all the other nodes, function Head actually returns the node next to the sentinel. Figure 11.2 illustrates the Enqueue and Dequeue processes with a sentinel node.

Figure 11.2: Enqueue and Dequeue on a linked-list queue: (a) before and (b) after Enqueue of x; (c) before and (d) after Dequeue.

Translating the pseudo code to an ANSI C program yields the below code.

struct Queue* enqueue(struct Queue* q, Key x) {
    struct Node* p = (struct Node*)malloc(sizeof(struct Node));
    p->key = x;
    p->next = NULL;
    q->tail->next = p;
    q->tail = p;
    return q;
}

Key dequeue(struct Queue* q) {
    struct Node* p = head(q); /* gets the node next to the sentinel */
    Key x = key(p);
    q->head->next = p->next;
    if (q->tail == p)
        q->tail = q->head;
    free(p);
    return x;
}

This solution is simple and robust. It's easy to extend it even to concurrent environments (e.g. multicores): we can assign one lock to the head and another lock to the tail. The sentinel keeps us from being dead-locked in the empty case [1] [2].

Exercise 11.1

- Realize the Empty? and Head algorithms for the linked-list queue.
- Implement the singly linked-list queue in your favorite imperative programming language. Note that you need to provide functions to initialize and destroy the queue.

11.2.2 Circular buffer solution

Another typical solution to realize a queue is to use a plain array as a circular buffer (also known as a ring buffer). As opposed to a linked-list, an array supports appending to the tail in constant O(1) time if there is still space (of course we need to re-allocate space if the array is fully occupied). However, an array performs poorly, in O(N) time, when removing an element from the head and packing the space, because we need to shift all the remaining elements one cell ahead. The idea of the circular buffer is to reuse the free cells before the

²It's also possible to add new elements at the head and remove elements from the tail, but those operations are more complex than this approach.
first valid element after we remove elements from the head. The idea of the circular buffer is illustrated in figures 11.3 and 11.4.

If we set a maximum size for the buffer instead of dynamically allocating memory, the queue can be defined with the ANSI C code below.

struct Queue {
    Key* buf;
    int head, tail, size;
};

When initializing the queue, we are explicitly asked to provide the maximum size as an argument.

struct Queue* createQ(int max) {
    struct Queue* q = (struct Queue*)malloc(sizeof(struct Queue));
    q->buf = (Key*)malloc(sizeof(Key) * max);
    q->size = max;
    q->head = q->tail = 0;
    return q;
}

Testing whether a queue is empty is trivial.

function Empty?(Q)
  return Head(Q) = Tail(Q)

One brute-force implementation of Enqueue and Dequeue is to calculate the modulus of the index blindly, as the following.

function Enqueue(Q, x)
  if ¬ Full?(Q) then
    Tail(Q) ← (Tail(Q) + 1) mod Size(Q)
    Buffer(Q)[Tail(Q)] ← x

function Head(Q)
  if ¬ Empty?(Q) then
    return Buffer(Q)[Head(Q)]

function Dequeue(Q)
  if ¬ Empty?(Q) then
    Head(Q) ← (Head(Q) + 1) mod Size(Q)

Figure 11.3: A queue realized with a ring buffer: (a) continuously add some elements; (b) after removing some elements from the head, there are free cells; (c) go on adding elements till the boundary of the array; (d) the next element is added to the first free cell at the head; (e) all cells are occupied — the queue is full.

Figure 11.4: The circular buffer.

However, the modulus operation can be expensive and slow in some settings, so one may replace it with an adjustment, for example as in the ANSI C program below.

void enQ(struct Queue* q, Key x) {
    if (!fullQ(q)) {
        q->buf[q->tail++] = x;
        q->tail -= q->tail < q->size ? 0 : q->size;
    }
}

Key headQ(struct Queue* q) {
    return q->buf[q->head]; /* Assume the queue isn't empty */
}

Key deQ(struct Queue* q) {
    Key x = headQ(q);
    q->head++;
    q->head -= q->head < q->size ? 0 : q->size;
    return x;
}

Exercise 11.2

As the circular buffer is allocated with a maximum size parameter, please write a function to test if a queue is full, to avoid overflow. Note there are two cases: one is that the head is in front of the tail; the other is the contrary.

11.3 Purely functional solution

11.3.1 Paired-list queue

We can't just use a list to implement a queue, or we can't satisfy the abstract queue properties. This is because the singly linked-list, which is the back-end data structure in most functional settings, performs well on the head in constant O(1) time, while it performs in linear O(N) time on the tail, where N is the length of the list. Either dequeue or enqueue would take time proportional to the number of elements stored in the list, as shown in
  • 1247. 300 CHAPTER 11. QUEUE, NOT SO SIMPLE AS IT WAS THOUGHT Exercise 11.2 As the circular buer is allocated with a maximum size parameter, please write a function to test if a queue is full to avoid over ow. Note there are two cases, one is that the head is in front of the tail, the other is on the contrary. 11.3 Purely functional solution 11.3.1 Paired-list queue We can't just use a list to implement queue, or we can't satisfy abstract queue properties. This is because singly linked-list, which is the back-end data struc-ture in most functional settings, performs well on head in constant O(1) time, while it performs in linear O(N) time on tail, where N is the length of the list. Either dequeue or enqueue will perform proportion to the number of elements stored in the list as shown in
  • 1248. gure 11.5. EnQueue O(1) x[N] x[N-1] ... x[2] x[1] NIL DeQueue O(N) (a) DeQueue performs poorly. EnQueue O(N) x[N] x[N-1] ... x[2] x[1] NIL DeQueue O(1) (b) EnQueue performs poorly. Figure 11.5: DeQueue and EnQueue can't perform both in constant O(1) time with a list. We neither can add a pointer to record the tail position of the list as what we have done in the imperative settings like in the ANSI C program, because of the nature of purely functional. Chris Okasaki mentioned a simple and straightforward functional solution in [6]. The idea is to maintain two linked-lists as a queue, and concatenate these two lists in a tail-to-tail manner. The shape of the queue looks like a horseshoe magnet as shown in
  • 1249. gure 11.6. With this setup, we push new element to the head of the rear list, which is ensure to be O(1) constant time; on the other hand, we pop element from the head of the front list, which is also O(1) constant time. So that the abstract queue properties can be satis
  • 1251. nition of such paired-list queue can be expressed in the following Haskell code. type Queue a = ([a], [a]) empty = ([], []) Suppose function front(Q) and rear(Q) return the front and rear list in such setup, and Queue(F;R) create a paired-list queue from two lists F and R. The EnQueue (push) and DeQueue (pop) operations can be easily realized based on this setup. push(Q; x) = Queue(front(Q); fxg [ rear(Q)) (11.1)
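The abstract interface also needs front; a minimal sketch for the paired-list pair (the name frontQ is ours, and it assumes the invariant — maintained by the balance function below — that the front list of a non-empty queue is never empty):

frontQ :: ([a], [a]) -> a
frontQ (x:_, _) = x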
Figure 11.6: A queue with front and rear lists shaped like a horseshoe magnet: (a) a horseshoe magnet; (b) concatenate two lists tail-to-tail.
pop(Q) = Queue(tail(front(Q)), rear(Q))    (11.2)

where, if a list X = {x1, x2, ..., xn}, the function tail(X) = {x2, x3, ..., xn} returns the rest of the list without the first element.

However, we must next solve the problem that, after several pop operations, the front list becomes empty while there are still elements in the rear list. One method is to rebuild the queue by reversing the rear list and using it to replace the front list. Hence a balance operation is executed after popping. Let's denote the front and rear lists of a queue Q as F = front(Q) and R = rear(Q).

balance(F, R) = Queue(reverse(R), ∅)    : F = ∅
                Q                       : otherwise    (11.3)

Thus if the front list isn't empty, we do nothing, while when the front list becomes empty, we use the reversed rear list as the new front list, and the new rear list is empty. The new enqueue and dequeue algorithms are updated as below.

push(Q, x) = balance(F, {x} ∪ R)    (11.4)

pop(Q) = balance(tail(F), R)    (11.5)

Summing up the above algorithms and translating them to Haskell yields the following program.

balance :: Queue a -> Queue a
balance ([], r) = (reverse r, [])
balance q = q

push :: Queue a -> a -> Queue a
push (f, r) x = balance (f, x:r)

pop :: Queue a -> Queue a
pop ([], _) = error "Empty"
pop (_:f, r) = balance (f, r)

However, although we only touch the heads of the front list and the rear list, the overall performance can't always be kept at O(1). Actually, the performance of this algorithm is amortized O(1). This is because the reverse operation takes time proportional to the length of the rear list; it's bound to O(N) time, where N = |R|. We leave the proof of the amortized performance as an exercise to the reader.

11.3.2 Paired-array queue - a symmetric implementation

There is an interesting implementation which is symmetric to the paired-list queue. In some old programming languages, such as legacy versions of BASIC, arrays are supported, but there are no pointers or records to represent a linked-list. Although we can use an extra array to store indexes, and so represent a linked-list with an implicit array, there is another option to realize an amortized O(1) queue. Compare the performance of the array and the linked-list; the table below reveals some facts (suppose both contain N elements).

operation         Array    Linked-list
insert on head    O(N)     O(1)
insert on tail    O(1)     O(N)
remove on head    O(N)     O(1)
remove on tail    O(1)     O(N)

Note that the linked-list performs in constant time on the head but in linear time on the tail, while the array performs in constant time on the tail (supposing there is enough memory space, and omitting memory reallocation for simplification) but in linear time on the head. This is because we need to shift elements when preparing or eliminating an empty cell in an array (see the chapter 'The evolution of insertion sort' for detail).

The above table shows an interesting characteristic that we can exploit to provide a solution mimicking the paired-list queue: we concatenate two arrays head-to-head to make a horseshoe-shaped queue, like in figure 11.7.

Figure 11.7: A queue with front and rear arrays shaped like a horseshoe magnet: (a) a horseshoe magnet; (b) concatenate two arrays head-to-head.

We can define such a paired-array queue with the following Python code.³

class Queue:
    def __init__(self):
        self.front = []
        self.rear = []

def is_empty(q):
    return q.front == [] and q.rear == []

³Legacy BASIC code is not presented here. And we actually use a list, not an array, in Python to illustrate the idea. ANSI C and ISO C++ programs are provided along with this chapter; they show more of a purely array manner.

The relative Push() and Pop() algorithms only manipulate the tails of the arrays.

function Push(Q, x)
  Append(Rear(Q), x)

Here we assume that the Append() algorithm appends element x to the end of the array and handles the necessary memory allocation, etc. Actually, there are multiple memory handling approaches; for example, besides dynamic re-allocation, we can initialize the array with enough space and just report an error if it's full.

function Pop(Q)
  if Front(Q) = ∅ then
    Front(Q) ← Reverse(Rear(Q))
    Rear(Q) ← ∅
  N ← Length(Front(Q))
  x ← Front(Q)[N]
  Length(Front(Q)) ← N − 1
  return x

For simplification and pure illustration purposes, the array isn't shrunk explicitly after elements are removed, so testing whether the front array is empty (= ∅) can be realized by checking if the length of the array is zero. We omit these details here. The push and pop algorithms can be translated to Python programs straightforwardly.

def push(q, x):
    q.rear.append(x)

def pop(q):
    if q.front == []:
        q.rear.reverse()
        (q.front, q.rear) = (q.rear, [])
    return q.front.pop()

Similar to the paired-list queue, the performance is amortized O(1), because the reverse procedure takes linear time.

Exercise 11.3

- Prove that the amortized performance of the paired-list queue is O(1).
- Prove that the amortized performance of the paired-array queue is O(1).

11.4 A small improvement, Balanced Queue

Although the paired-list queue is amortized O(1) for popping and pushing, the solution we proposed in the previous section performs poorly in the worst case. For example, suppose there is one element in the front list and we push N elements continuously onto the queue, where N is a big number. According to the strategy we have used so far, all N elements are added to the rear list. The front list becomes empty after one pop operation, so the algorithm starts to reverse the rear list. This reversing procedure is bound to O(N) time, proportional to the length of the rear list. Sometimes this can't be acceptable for a very big N.
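To make this worst case concrete, a small back-of-the-envelope count for a hypothetical run: start with F = {x} and R = ∅, then push N elements, giving |F| = 1 and |R| = N. The first pop empties F, so balance must execute reverse(R) in a single call:

cost(pop) = 1 + |R| = N + 1

one single O(N) operation, even though the amortized cost per operation over the whole run is still O(1).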
The reason why this worst case happens is that the front and rear lists are extremely unbalanced. We can improve our paired-list queue design by making them more balanced. One option is to add a balancing constraint:

|R| ≤ |F|    (11.6)

where R = Rear(Q), F = Front(Q), and |L| is the length of list L. This constraint ensures that the rear list never grows longer than the front list, so the reverse procedure is executed as soon as the rear list would become longer than the front list.

Here we need to frequently access the length information of a list. However, calculating the length takes linear time for a singly linked-list. We can record the length in a variable and update it as elements are added and removed. This approach enables us to get the length information in constant time. The example below shows the modified paired-list queue definition, which is augmented with length fields.

data BalanceQueue a = BQ [a] Int [a] Int

As we keep the invariant specified in (11.6), we can easily tell if a queue is empty by testing the length of the front list.

F = ∅ ⟺ |F| = 0    (11.7)

In the rest of this section, we suppose that the length of a list L can be retrieved as |L| in constant time.

Push and pop are almost the same as before, except that we check the balance invariant by passing the length information and perform reversing accordingly.

push(Q, x) = balance(F, |F|, {x} ∪ R, |R| + 1)    (11.8)

pop(Q) = balance(tail(F), |F| − 1, R, |R|)    (11.9)

where the function balance() is defined as the following:

balance(F, |F|, R, |R|) = Queue(F, |F|, R, |R|)                     : |R| ≤ |F|
                          Queue(F ∪ reverse(R), |F| + |R|, ∅, 0)    : otherwise    (11.10)

Note that the function Queue() takes four parameters: the front list along with its (recorded) length, and the rear list along with its length, and forms a paired-list queue augmented with length fields.

We can easily translate the equations to a Haskell program, and we can enforce the abstract queue interface by making the implementation an instance of the Queue type class.

instance Queue BalanceQueue where
    empty = BQ [] 0 [] 0

    isEmpty (BQ _ lenf _ _) = lenf == 0

    -- Amortized O(1) time push
    push (BQ f lenf r lenr) x = balance f lenf (x:r) (lenr + 1)

    -- Amortized O(1) time pop
    pop (BQ (_:f) lenf r lenr) = balance f (lenf - 1) r lenr

    front (BQ (x:_) _ _ _) = x

balance f lenf r lenr
    | lenr <= lenf = BQ f lenf r lenr
    | otherwise = BQ (f ++ (reverse r)) (lenf + lenr) [] 0

Exercise 11.4

Write the symmetric balance improvement solution for the paired-array queue in your favorite imperative programming language.

11.5 One more step improvement, Real-time Queue

Although the extreme worst case can be avoided by improving the balancing as presented in the previous section, the performance of reversing the rear list is still bound to O(N), where N = |R|. So if the rear list is very long, the instant performance is still unacceptably poor even though the amortized time is O(1). It is particularly important in some real-time systems to ensure the worst case performance.

As we have analyzed, the bottleneck is the computation of F ∪ reverse(R). This happens when |R| > |F|. Considering that |F| and |R| are both integers, this computation happens when

|R| = |F| + 1    (11.11)

Both F and the result of reverse(R) are singly linked-lists. It takes O(|F|) time to concatenate them together, and it takes extra O(|R|) time to reverse the rear list, so the total computation is bound to O(N), where N = |F| + |R|, which is proportional to the total number of elements in the queue.

In order to realize a real-time queue, we can't compute F ∪ reverse(R) monolithically. Our strategy is to distribute this expensive computation across every pop and push operation. Thus although each pop and push gets a bit slower, we avoid the extremely slow worst-case pop or push.

Incremental reverse

Let's examine how the functional reverse algorithm is typically implemented.

reverse(X) = ∅                       : X = ∅
             reverse(X') ∪ {x1}      : otherwise    (11.12)

where X' = tail(X) = {x2, x3, ...}.

This is a typical recursive algorithm: if the list to be reversed is empty, the result is just an empty list — this is the edge case; otherwise, we take the first element x1 from the list, reverse the rest {x2, x3, ..., xn} to {xn, xn−1, ..., x3, x2}, and append x1 after it. However, this algorithm performs poorly, as appending an element to the end of a list is proportional to the length of the list. So it's O(N²), not a linear time reverse algorithm.
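A direct Haskell transcription of (11.12) makes the quadratic behaviour visible, since each (++) walks the whole already-reversed prefix (a sketch for illustration only; it is named reverse' here to avoid clashing with the Prelude's reverse):

-- O(N^2): every step appends one element to the end of the result
reverse' :: [a] -> [a]
reverse' [] = []
reverse' (x:xs) = reverse' xs ++ [x]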
  • 1269. rst element x1 from the list, reverse the rest fx2; x3; :::; xng, to fxn; xn1; ::; x3; x2g and append x1 after it. However, this algorithm performs poor, as appending an element to the end of a list is proportion to the length of the list. So it's O(N2), but not a linear time reverse algorithm.
  • 1270. 11.5. ONE MORE STEP IMPROVEMENT, REAL-TIME QUEUE 307 There exists another implementation which utilizes an accumulator A, like below. 0 (X; ) (11.13) reverse(X) = reverse Where 0 (X;A) = reverse A : X = reverse0(X0; fx1g [ A) : otherwise (11.14) We call A as accumulator because it accumulates intermediate reverse result at any time. Every time we call reverse0(X;A), list X contains the rest of elements wait to be reversed, and A holds all the reversed elements so far. For instance when we call reverse0() at i-th time, X and A contains the following elements: X = fxi; xi+1; :::; xng A = fxi1; xi2; :::x1g In every non-trivial case, we takes the
  • 1271. rst element from X in O(1) time; then put it in front of the accumulator A, which is again O(1) constant time. We repeat it N times, so this is a linear time (O(N)) algorithm. The latter version of reverse is obviously a tail-recursion algorithm, see [5] and [6] for detail. Such characteristic is easy to change from monolithic algo-rithm to incremental manner. The solution is state transferring. We can use a state machine contains two types of stat: reversing state Sr to indicate that the reverse is still on-going (not
  • 1273. nish state Sf to indicate the reverse has been done (
  • 1274. nished). In Haskell programming language, it can be de
  • 1275. ned as a type. data State a = j Reverse [a] [a] j Done [a] And we can schedule (slow-down) the above reverse0(X;A) function with these two types of state. step(S;X;A) = (Sf ;A) : S = Sr ^ X = (Sr;X0; fx1g [ A) : S = Sr ^ X6= (11.15) Each step, we examine the state type
  • 1276. rst, if the current state is Sr (on-going), and the rest elements to be reversed in X is empty, we can turn the algorithm to
  • 1277. nish state Sf ; otherwise, we take the
  • 1278. rst element from X, put it in front of A just as same as above, but we do NOT perform recursion, instead, we just
  • 1279. nish this step. We can store the current state as well as the resulted X and A, the reverse can be continued at any time when we call 'next' step function in the future with the stored state, X and A passed in. Here is an example of this step-by-step reverse algorithm. step(Sr; hello; ) = (Sr; ello; h) step(Sr; ello; h) = (Sr; llo; eh) ::: step(Sr; o; lleh) = (Sr;; olleh) step(Sr;; olleh) = (Sf ; olleh) And in Haskell code manner, the example is like the following.
  • 1280. 308 CHAPTER 11. QUEUE, NOT SO SIMPLE AS IT WAS THOUGHT step $ Reverse hello [] = Reverse ello h step $ Reverse ello h = Reverse llo eh ... step $ Reverse o lleh = Reverse [] olleh step $ Reverse [] olleh = Done olleh Now we can distribute the reverse into steps in every pop and push op-erations. However, the problem is just half solved. We want to break down F [ reverse(R), and we have broken reverse(R) into steps, we next need to schedule(slow-down) the list concatenation part F[:::, which is bound to O(jFj), into incremental manner so that we can distribute it to pop and push operations. Incremental concatenate It's a bit more challenge to implement incremental list concatenation than list reversing. However, it's possible to re-use the result we gained from increment reverse by a small trick: In order to realize X [ Y , we can
first reverse X to ~X = reverse(X), then take elements one by one from ~X and put them in front of Y, just as we did in reverse'.

X ∪ Y ≡ reverse(reverse(X)) ∪ Y
      ≡ reverse'(reverse(X), ∅) ∪ Y
      ≡ reverse'(reverse(X), Y)
      ≡ reverse'(~X, Y)    (11.16)

This fact indicates that we can use an extra state to instruct the step() function to continue concatenating ~F after R is reversed. The strategy is to do the total work in two phases:

1. Reverse both F and R in parallel, to get ~F = reverse(F) and ~R = reverse(R) incrementally;
2. Incrementally take elements from ~F and put them in front of ~R.

So we define three types of state: Sr represents reversing, Sc represents concatenating, and Sf represents finish. In Haskell, these types of state are defined as the following.

data State a = Reverse [a] [a] [a] [a]
             | Concat [a] [a]
             | Done [a]

Because we reverse F and R simultaneously, the reversing state takes two pairs of lists and accumulators.

The state transferring is defined according to the two-phase strategy described previously. Denote F = {f1, f2, ...}, F' = tail(F) = {f2, f3, ...}, R = {r1, r2, ...}, and R' = tail(R) = {r2, r3, ...}. A state S contains its type, which has a value among Sr, Sc, and Sf. Note that S also contains the necessary parameters, such as F, ~F, X, A, etc., as intermediate results. These parameters vary according to the different states.

next(S) = (Sr, F', {f1} ∪ ~F, R', {r1} ∪ ~R)    : S = Sr ∧ F ≠ ∅ ∧ R ≠ ∅
          (Sc, ~F, {r1} ∪ ~R)                   : S = Sr ∧ F = ∅ ∧ R = {r1}
          (Sf, A)                               : S = Sc ∧ X = ∅
          (Sc, X', {x1} ∪ A)                    : S = Sc ∧ X ≠ ∅    (11.17)

The relative Haskell program is listed as below.

next (Reverse (x:f) f' (y:r) r') = Reverse f (x:f') r (y:r')
next (Reverse [] f' [y] r') = Concat f' (y:r')
next (Concat [] acc) = Done acc
next (Concat (x:f') acc) = Concat f' (x:acc)

All that is left is to distribute these incremental steps into every pop and push operation, to implement a real-time O(1) purely functional queue.

Sum up

Before we dive into the
final real-time queue implementation, let's analyze how many incremental steps are needed to achieve the result of F ∪ reverse(R).

According to the balance invariant we used previously, |R| = |F| + 1. Let M = |F|. Once the queue gets unbalanced due to some push or pop operation, we start this incremental F ∪ reverse(R). It needs M + 1 steps to reverse R, and at the same time we finish reversing the list F within these steps. After that, we need M + 1 extra steps to execute the concatenation. So there are 2M + 2 steps in total.

It seems that distributing one step inside each pop or push operation is the natural solution. However, there is a critical question that must be answered: is it possible that, before we finish these 2M + 2 steps, the queue gets unbalanced again due to a series of pushes and pops?

There are two facts about this question; one is good news and the other is bad news.

Let's first show the good news: luckily, continuous pushing can't make the queue unbalanced again before we finish these 2M + 2 steps to achieve F ∪ reverse(R). This is because, once we start re-balancing, we get a new front list F' = F ∪ reverse(R) after 2M + 2 steps, while the next unbalance is triggered when

|R'| = |F'| + 1
     = |F| + |R| + 1
     = 2M + 2    (11.18)

That is to say, even if we continuously push as many elements as possible after the last unbalanced point, when the queue gets unbalanced again, the 2M + 2 steps have exactly finished at that time, which means the new front list F' has been calculated. We can then safely go on to compute F' ∪ reverse(R'), thanks to the balance invariant designed in the previous section.

But the bad news is that a pop operation can happen at any time before these 2M + 2 steps finish. The situation is that, once we want to extract an element from the front list, the new front list F' = F ∪ reverse(R) isn't ready yet; we don't have a valid front list at hand.

Table 11.1: Intermediate state of a queue before the 2M + 2 steps finish.

front copy:            {fi, fi+1, ..., fM}       (the first i − 1 elements have been popped)
on-going computation:  (Sr, ~F, ..., ~R, ...)    (the intermediate results ~F and ~R)
new rear:              {...}                     (newly pushed elements)

One solution to this problem is to keep a copy of the original front list F during the time we are calculating reverse(F), as described in phase 1 of our incremental computing strategy. So we are still safe even if the user continuously performs the first M pop operations. Table 11.1 shows the queue at some time after we start the incremental computation and before phase 1 (reversing F and R simultaneously) ends.⁴

After these M pop operations, the copy of F is exhausted, and we just start the incremental concatenation phase at that time. What if the user goes on popping?

The fact is that, since F is exhausted (it becomes ∅), we needn't do the concatenation at all, since F ∪ ~R = ∅ ∪ ~R = ~R.

This indicates that, when doing the concatenation, we only need to concatenate those elements that haven't been popped and are still left in F. As the user pops elements one by one continuously from the head of the front list F, one method is to use a counter to record how many elements remain in F. The counter is initialized as 0 when we start computing F ∪ reverse(R); it's increased by one every time we reverse one element in F, which means we'll need to concatenate this element in the future; and it's decreased by one every time a pop is performed, which means we can concatenate one element less. Of course we also need to decrease this counter in every step of the concatenation. If and only if this counter becomes zero, we needn't do any more concatenation.

⁴One may wonder that copying a list takes linear time in the length of the list. If so, the whole solution would make no sense. Actually, this linear time copying won't happen at all, because of the purely functional nature: the front list won't be mutated either by popping or by reversing. However, when trying to realize a symmetric solution with a paired array, mutating the array in place, this issue must be dealt with; we can perform a 'lazy' copying, where the real copying work doesn't execute immediately but instead copies one element in every step of the incremental reversing. The detailed implementation is left as an exercise.

We can give the realization of the purely functional real-time queue according to the above analysis. We
first add an idle state S0 to simplify the state transferring. The Haskell program below shows an example of this modified state definition.

data State a = Empty
             | Reverse Int [a] [a] [a] [a] -- n, f', acc_f', r, acc_r
             | Concat Int [a] [a] -- n, rev_f', acc
             | Done [a] -- result: f ++ reverse r

And the data structure is defined with three parts: the front list (augmented with its length), the on-going state of computing F ∪ reverse(R), and the rear list (augmented with its length). Here is the Haskell definition of the real-time queue.

data RealtimeQueue a = RTQ [a] Int (State a) [a] Int

The empty queue is composed of an empty front and rear list together with the idle state S0, as Queue(∅, 0, S0, ∅, 0). And we can test if a queue is empty by
  • 1305. ned before. Push and pop are changed accordingly. push(Q; x) = balance(F; jFj; S; fxg [ R; jRj + 1) (11.19) pop(Q) = balance(F 0 ; jFj 1; abort(S);R; jRj) (11.20) The major dierence is abort() function. Based on our above analysis, when there is popping, we need decrease the counter, so that we can concatenate one element less. We de
  • 1306. ne this as aborting. The details will be given after balance() function. The relative Haskell code for push and pop are listed like this. push (RTQ f lenf s r lenr) x = balance f lenf s (x:r) (lenr + 1) pop (RTQ (_:f) lenf s r lenr) = balance f (lenf - 1) (abort s) r lenr The balance() function
  • 1307. rst check the balance invariant, if it's violated, we need start re-balance it by starting compute F [ reverse(R) incrementally; otherwise we just execute one step of the un
  • 1308. nished incremental computation. balance(F; jFj; S;R; jRj) = step(F; jFj; S;R; jRj) : jRj jFj step(F; jFj + jRj; (Sr; 0; F;;R; ); 0) : otherwise (11.21) The relative Haskell code is given like below. balance f lenf s r lenr j lenr lenf = step f lenf s r lenr j otherwise = step f (lenf + lenr) (Reverse 0 f [] r []) [] 0 The step() function typically transfer the state machine one state ahead, and it will turn the state to idle (S0) when the incremental computation
  • 1309. nishes. step(F; jFj; S;R; jRj) = Queue(F0; jFj; S0;R; jRj) : S0 = Sf Queue(F; jFj; S0;R; jRj) : otherwise (11.22) Where S0 = next(S) is the next state transferred; F0 = F [ reverse(R), is the
  • 1310. nal new front list result from the incremental computing. The real state transferring is implemented in next() function as the following. It's dierent from previous version by adding the counter
  • 1311. eld n to record how many elements left we need to concatenate. next(S) = 8 : (Sr; n + 1; F0; ff1g [ F ;R0; fr1g [ R) : S = Sr ^ F6= (Sc; n; F ; fr1g [ R) : S = Sr ^ F = (Sf ;A) : S = Sc ^ n = 0 (Sc; n 1;X0; fx1g [ A) : S = Sc ^ n6= 0 S : otherwise (11.23) And the corresponding Haskell code is like this.
  • 1312. 312 CHAPTER 11. QUEUE, NOT SO SIMPLE AS IT WAS THOUGHT next (Reverse n (x:f) f' (y:r) r') = Reverse (n+1) f (x:f') r (y:r') next (Reverse n [] f' [y] r') = Concat n f' (y:r') next (Concat 0 _ acc) = Done acc next (Concat n (x:f') acc) = Concat (n-1) f' (x:acc) next s = s Function abort() is used to tell the state machine, we can concatenate one element less since it is popped. abort(S) = 8 : (Sf ;A0) : S = Sc ^ n = 0 (Sc; n 1;X0A) : S = Sc ^ n6= 0 (Sr; n 1; F; F ;R; R) : S = Sr S : otherwise (11.24) Note that when n = 0 we actually rollback one concatenated element by return A0 as the result but not A. (Why? this is left as an exercise.) The Haskell code for abort function is like the following. abort (Concat 0 _ (_:acc)) = Done acc -- Note! we rollback 1 elem abort (Concat n f' acc) = Concat (n-1) f' acc abort (Reverse n f f' r r') = Reverse (n-1) f f' r r' abort s = s It seems that we've done, however, there is still one tricky issue hidden behind us. If we push an element x to an empty queue, the result queue will be: Queue(; 1; (Sc; 0;; fxg);; 0) If we perform pop immediately, we'll get an error! We found that the front list is empty although the previous computation of F [ reverse(R) has been
  • 1313. nished. This is because it takes one more extra step to transfer from the state (Sc; 0;;A) to (Sf ;A). It's necessary to re
  • 1314. ne the S0 in step() function a bit. S0 = next(next(S)) : F = next(S) : otherwise (11.25) The modi
  • 1315. cation re ects to the below Haskell code: step f lenf s r lenr = case s' of Done f' ! RTQ f' lenf Empty r lenr s' ! RTQ f lenf s' r lenr where s' = if null f then next $ next s else next s Note that this algorithm diers from the one given by Chris Okasaki in [6]. Okasaki's algorithm executes two steps per pop and push, while the one presents in this chapter executes only one per pop and push, which leads to more distributed performance. Exercise 11.5 Why need we rollback one element when n = 0 in abort() function? Realize the real-time queue with symmetric paired-array queue solution in your favorite imperative programming language.
  • 1316. 11.6. LAZY REAL-TIME QUEUE 313 In the footnote, we mentioned that when we start incremental reversing with in-place paired-array solution, copying the array can't be done mono-lithic or it will lead to linear time operation. Implement the lazy copying so that we copy one element per step along with the reversing. 11.6 Lazy real-time queue The key to realize a real-time queue is to break down the expensive F [ reverse(R) to avoid monolithic computation. Lazy evaluation is particularly helpful in such case. In this section, we'll explore if there is some more elegant solution by exploit laziness. Suppose that there exits a function rotate(), which can compute F[reverse(R) incrementally. that's to say, with some accumulator A, the following two func-tions are equivalent. rotate(X; Y;A) X [ reverse(Y ) [ A (11.26) Where we initialized X as the front list F, Y as the rear list R, and the accumulator A is initialized as empty . The trigger of rotation is still as same as before when jFj + 1 = jRj. Let's keep this constraint as an invariant during the whole rotation process, that jXj + 1 = jY j always holds. It's obvious to deduce to the trivial case: rotate(; fy1g;A) = fy1g [ A (11.27) Denote X = fx1; x2; :::g, Y = fy1; y2; :::g, and X0 = fx2; x3; :::g, Y 0 = fy2; y3; :::g are the rest of the lists without the
  • 1317. rst element for X and Y re-spectively. The recursion case is ruled out as the following. rotate(X; Y;A) X [ reverse(Y ) [ A De
  • 1318. nition of (11:26) fx1g [ (X0 [ reverse(Y ) [ A) Associative of [ fx1g [ (X0 [ reverse(Y 0) [ (fy1g [ A)) Nature of reverse and associative of [ fx1g [ rotate(X0; Y 0; fy1g [ A) De
  • 1319. nition of (11:26) (11.28) Summarize the above two cases, yields the
  • 1320. nal incremental rotate algorithm. rotate(X; Y;A) = fy1g [ A : X = fx1g [ rotate(X0; Y 0; fy1g [ A) : otherwise (11.29) If we execute [ lazily instead of strictly, that is, execute [ once pop or push operation is performed, the computation of rotate can be distribute to push and pop naturally. Based on this idea, we modify the paired-list queue de
  • 1321. nition to change the front list to a lazy list, and augment it with a computation stream. [5]. When the queue triggers re-balance constraint by some pop/push, that jFj + 1 = jRj, The algorithm creates a lazy rotation computation, then use this lazy rotation as the new front list F0; the new rear list becomes , and a copy of F0 is maintained as a stream.
  • 1322. 314 CHAPTER 11. QUEUE, NOT SO SIMPLE AS IT WAS THOUGHT After that, when we performs every push and pop; we consume the stream by forcing a [ operation. This results us advancing one step along the stream, fxg [ F00, where F0 = tail(F0). We can discard x, and replace the stream F0 with F00. Once all of the stream is exhausted, we can start another rotation. In order to illustrate this idea clearly, we turns to Scheme/Lisp programming language to show example codes, because it gives us explicit control of laziness. In Scheme/Lisp, we have the following three tools to deal with lazy stream. (define (cons-stream a b) (cons a (delay b))) (define stream-car car) (define (stream-cdr s) (cdr (force s))) So 'cons-stream' constructs a 'lazy' list from an element x and an existing list L without really evaluating the value of L; The evaluation is actually delayed to 'stream-cdr', where the computation is forced. delaying can be realized by lambda calculus, please refer to [5] for detail. The lazy paired-list queue is de
defined as the following.

(define (make-queue f r s) (list f r s))

;; Auxiliary accessor functions
(define (front-lst q) (car q))
(define (rear-lst q) (cadr q))
(define (rots q) (caddr q))

A queue consists of three parts: a front list, a rear list, and a stream which represents the computation of F ∪ reverse(R). Creating an empty queue is trivial: all three parts are null.

(define empty (make-queue '() '() '()))

Note that the front list is actually a lazy stream, so we need to use the stream related functions to manipulate it. For example, the following function tests whether the queue is empty by checking the front lazy stream.

(define (empty? q) (stream-null? (front-lst q)))

The push function is almost the same as the one given in the previous section: we put the new element in front of the rear list, then examine the balance invariant and do the necessary balancing work.

    push(Q, x) = balance(F, {x} ∪ R, Rs)        (11.30)

Where F is the lazy stream of the front list, and Rs is the stream of the rotation computation. The corresponding Scheme/Lisp code is given below.

(define (push q x)
  (balance (front-lst q) (cons x (rear-lst q)) (rots q)))
Pop is a bit different: because the front list is actually a lazy stream, we need to force an evaluation. All the rest is the same as before.

    pop(Q) = balance(F', R, Rs)        (11.31)

Here F' forces one evaluation of F. The Scheme/Lisp code for this equation is as follows.

(define (pop q)
  (balance (stream-cdr (front-lst q)) (rear-lst q) (rots q)))

For illustration purposes, we skip the error handling (such as popping from an empty queue, etc.) here. One can access the top element of the queue by extracting it from the front stream.

(define (front q) (stream-car (front-lst q)))

The balance function
first checks whether the computation stream is completely exhausted, and starts a new rotation accordingly; otherwise, it just consumes one evaluation by forcing the lazy stream.

    balance(F, R, Rs) = { Queue(F', ∅, F')   : Rs = ∅
                        { Queue(F, R, Rs')   : otherwise
                                               (11.32)

Here F' is
defined to start a new rotation.

    F' = rotate(F, R, ∅)        (11.33)

The corresponding Scheme/Lisp program is listed accordingly.

(define (balance f r s)
  (if (stream-null? s)
      (let ((newf (rotate f r '())))
        (make-queue newf '() newf))
      (make-queue f r (stream-cdr s))))

The implementation of the incremental rotate function is just the same as what we analyzed above.

(define (rotate xs ys acc)
  (if (stream-null? xs)
      (cons-stream (car ys) acc)
      (cons-stream (stream-car xs)
                   (rotate (stream-cdr xs) (cdr ys)
                           (cons-stream (car ys) acc)))))

We used explicit lazy evaluation in Scheme/Lisp. Actually, this program can be very short in a lazy programming language such as Haskell.

data LazyRTQueue a = LQ [a] [a] [a] -- front, rear, f ++ reverse r

instance Queue LazyRTQueue where
  empty = LQ [] [] []

  isEmpty (LQ f _ _) = null f

  -- O(1) time push
  push (LQ f r rot) x = balance f (x:r) rot

  -- O(1) time pop
  pop (LQ (_:f) r rot) = balance f r rot

  front (LQ (x:_) _ _) = x

balance f r [] = let f' = rotate f r [] in LQ f' [] f'
balance f r (_:rot) = LQ f r rot

rotate [] [y] acc = y:acc
rotate (x:xs) (y:ys) acc = x : rotate xs ys (y:acc)
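The Queue type class used by this instance is defined in the previous chapter; for reference, a minimal sketch of it (the exact signatures here are an assumption) looks like this:

class Queue q where
  empty   :: q a
  isEmpty :: q a -> Bool
  push    :: q a -> a -> q a
  pop     :: q a -> q a
  front   :: q a -> a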
11.7 Notes and short summary

Just as mentioned at the beginning of this book, in the first chapter: the queue isn't as simple as it was thought. We've tried to explain algorithms and data structures both in imperative and in functional approaches, and it often gives the impression that the functional way is simpler and more expressive. However, there are still plenty of areas where more study and work are needed to give equivalent functional solutions. Queue is such an important topic that it links to many fundamental purely functional data structures. That's why Chris Okasaki studied it intensively and devoted a great amount of discussion to it in [6].

With the purely functional queue solved, we can easily implement the dequeue with a similar approach, as revealed in this chapter. As we can handle elements effectively at both head and tail, we can advance one step further to realize sequence data structures, which support fast concatenation, and finally realize random access data structures that mimic the array in imperative settings. The details will be explained in later chapters.

Note that although we haven't mentioned the priority queue, it's quite possible to realize it with heaps. We have covered the topic of heaps in several previous chapters.

Exercise 11.6

1. Realize the dequeue, which supports adding and removing elements on both sides in constant O(1) time, in a purely functional way.

2. Realize the dequeue in a symmetric solution with only arrays in your favorite imperative programming language.
Bibliography

[1] Maged M. Michael and Michael L. Scott. "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms". https://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html

[2] Herb Sutter. "Writing a Generalized Concurrent Queue". Dr. Dobb's, Oct 29, 2008. http://drdobbs.com/cpp/211601363?pgno=1

[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.

[4] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, July 1, 1999. ISBN-13: 978-0521663502

[5] Wikipedia. "Tail call". http://en.wikipedia.org/wiki/Tail_call

[6] Wikipedia. "Recursion (computer science)". http://en.wikipedia.org/wiki/Recursion_(computer_science)#Tail-recursive_functions

[7] Harold Abelson, Gerald Jay Sussman, Julie Sussman. Structure and Interpretation of Computer Programs, 2nd Edition. MIT Press, 1996. ISBN 0-262-51087-1
Chapter 12

Sequences, The last brick

12.1 Introduction

In the
first chapter of this book, which introduced the binary search tree as the `hello world' data structure, we mentioned that neither queue nor array is simple to realize, not only in the imperative way, but also in the functional approach. In the previous chapter, we explained the functional queue, which achieves performance similar to its imperative counterpart. In this chapter, we'll dive into the topic of array-like data structures.

We have introduced several data structures in this book so far, and it seems that functional approaches typically bring more expressive and elegant solutions. However, there are some areas where people haven't found competitive purely functional solutions that can match the imperative ones. One instance is the Ukkonen linear time suffix tree construction algorithm; another example is the hash table. Array is also among them. Array is trivial in imperative settings; it enables random access of any element by index in constant O(1) time. However, this performance target can't be achieved directly in purely functional settings, as only the list is available there.

In this chapter, we are going to abstract the concept of array to sequences, which support the following features:

- Elements can be inserted to or removed from the head of the sequence quickly, in O(1) time;
- Elements can be inserted to or removed from the tail of the sequence quickly, in O(1) time;
- Two sequences can be concatenated quickly (faster than linear time);
- Any element can be randomly accessed and updated quickly;
- The sequence can be split at any position quickly;

We call these features the abstract sequence properties, and it is easy to see that even array (here meaning the plain array) in imperative settings can't meet them all at the same time.
We'll provide three solutions in this chapter. First, we'll introduce a solution based on a forest of binary trees and numeric representation; second, we'll show a concatenate-able list solution; finally, we'll give the finger tree solution. Most of the results are based on Chris Okasaki's work in [6].

12.2 Binary random access list

12.2.1 Review of plain-array and list

Let's review the performance of the plain array and the singly linked list, so that we know how they perform in different cases.

    operation                    Array           Linked-list
    operation on head            O(N)            O(1)
    operation on tail            O(1)            O(N)
    access at random position    O(1)            average O(N)
    remove at given position     average O(N)    O(1)
    concatenate                  O(N2)           O(N1)

Because we hold the head of the linked list, operations on the head, such as insert and remove, perform in constant time, while we need to traverse to the end to perform removal or appending at the tail. Given a position i, it takes traversing i elements to access it; once we are at that position, removing the element from there is bound to constant time, by just modifying some pointers. In order to concatenate two linked lists, we need to traverse to the end of the first one and link it to the second one, which is bound to N1, the length of the first linked list.

For the array, on the other hand, we must prepare a free cell for inserting a new element at the head, and we need to release the first cell after the first element is removed. Both operations are achieved by shifting all the remaining elements backward or forward, which costs linear time. The operations on the tail of the array, however, are trivially constant time. The array also supports accessing a random position i by nature, but removing the element at that position causes all elements after it to shift one position ahead. In order to concatenate two arrays, we need to copy all elements from the second one to the end of the first one (ignoring the memory re-allocation details), which is proportional to N2, the length of the second array.

In the chapter about binomial heaps, we explained the idea of using a forest, which is a list of trees. It brings us the merit that, for any given number N, by representing it in binary, we know how many binomial trees are needed to hold N elements: each bit of 1 represents a binomial tree of the rank of that bit. We can go one step further: if we have an N-node binomial heap, for any given index 1 ≤ i ≤ N, we can quickly determine which binomial tree in the heap holds the i-th node.

12.2.2 Represent sequence by trees

One solution to realize a random-access sequence is to manage the sequence with a forest of complete binary trees. Figure 12.1 shows how we attach such trees to a sequence of numbers. Here two trees, t1 and t2, are used to represent the sequence {x1, x2, x3, x4, x5, x6}. The size of binary tree t1 is 2; the
first two elements {x1, x2} are leaves of t1;
the size of binary tree t2 is 4, and the next four elements {x3, x4, x5, x6} are leaves of t2.

[Figure 12.1: A sequence of 6 elements can be represented in a forest.]

For a complete binary tree, we define the depth as 1 if the tree has only a leaf. The tree is denoted as ti if its depth is i + 1; it's obvious that there are 2^i leaves in ti.

Any sequence containing N elements can be turned into a forest of complete binary trees in this manner. First we represent N as a binary number like below.

    N = 2^0 e0 + 2^1 e1 + ... + 2^M eM        (12.1)

Where ei is either 1 or 0, so that N = (eM eM-1 ... e1 e0)2. If ei ≠ 0, we then need a complete binary tree of size 2^i. For example, in Figure 12.1, the length of the sequence is 6, which is (110)2 in binary. The lowest bit is 0, so we needn't a tree of size 1; the second bit is 1, so we need a tree of size 2, which has a depth of 2; the highest bit is also 1, thus we need a tree of size 4, which has a depth of 3.

This method represents the sequence {x1, x2, ..., xN} as a list of trees {t0, t1, ..., tM}, where ti is either empty if ei = 0, or a complete binary tree if ei = 1. We call this representation a Binary Random Access List [6]. We can reuse the
definition of the binary tree. For example, the following Haskell program defines the tree and the binary random access list.

data Tree a = Leaf a
            | Node Int (Tree a) (Tree a)  -- size, left, right

type BRAList a = [Tree a]

The only difference from the typical binary tree is that we augment the tree with size information. This enables us to get the size without calculating it each time. For instance:
size (Leaf _) = 1
size (Node sz _ _) = sz

12.2.3 Insertion to the head of the sequence

The forest representation of a sequence enables many operations to be done effectively. For example, the operation of inserting a new element y in front of the sequence can be realized as follows:

1. Create a tree t', with y as its only leaf;
2. Examine the first tree in the forest and compare its size with t'. If its size is greater than that of t', just let t' be the new head of the forest. Since the forest is a linked list of trees, inserting t' at its head is a trivial operation, bound to constant O(1) time;
3. Otherwise, if the size of the first tree in the forest is equal to that of t', denote this tree in the forest as ti. We construct a new binary tree t'(i+1) by linking ti and t' as its left and right children. After that, we recursively try to insert t'(i+1) into the forest.

Figures 12.2 and 12.3 illustrate the steps of inserting the elements x1, x2, ..., x6 into an empty list.

[Figure 12.2: Steps of inserting elements to an empty list, 1. (a) A singleton leaf of x1; (b) Insert x2: it causes linking, resulting in a tree of height 1; (c) Insert x3: the result is two trees, t1 and t2; (d) Insert x4: it first causes linking two leaves into a binary tree, then performs linking again, which results in a final tree of height 2.]

As there are at most M trees in the forest, and M is bound to O(lg N), the insertion-to-head algorithm is ensured to perform in O(lg N) time even in the worst case. We'll prove that the amortized performance is O(1) later.
[Figure 12.3: Steps of inserting elements to an empty list, 2. (a) Insert x5: the forest is a leaf (t0) and t2; (b) Insert x6: it links two leaves to t1.]

Let's formalize the algorithm into equations. We define the function of inserting an element in front of a sequence as insert(S, x).

    insert(S, x) = insertTree(S, leaf(x))        (12.2)

This function just wraps element x into a singleton tree with one leaf, and calls insertTree to insert this tree into the forest. Suppose the forest F = {t1, t2, ...} if it's not empty, and F' = {t2, t3, ...} is the rest of the trees without the first one.

    insertTree(F, t) = { {t}                            : F = ∅
                       { {t} ∪ F                        : size(t) < size(t1)
                       { insertTree(F', link(t, t1))    : otherwise
                                                          (12.3)

Where the function link(t1, t2) creates a new tree from two smaller trees of the same size. Suppose the function tree(s, t1, t2) creates a tree, sets its size to s, and makes t1 the left child and t2 the right child; linking can then be realized as below.

    link(t1, t2) = tree(size(t1) + size(t2), t1, t2)        (12.4)

The corresponding Haskell programs can be given by translating these equations.

cons :: a -> BRAList a -> BRAList a
cons x ts = insertTree ts (Leaf x)

insertTree :: BRAList a -> Tree a -> BRAList a
insertTree [] t = [t]
insertTree (t':ts) t = if size t < size t' then t:t':ts
                       else insertTree ts (link t t')

-- Precondition: rank t1 = rank t2
link :: Tree a -> Tree a -> Tree a
link t1 t2 = Node (size t1 + size t2) t1 t2

Here we follow the Lisp tradition and name the function that inserts an element before a list `cons'.
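As a quick usage sketch (assuming the definitions above), building the sequence {1, 2, ..., 6} by repeated cons yields a forest with trees of size 2 and 4, matching 6 = (110)2:

example :: BRAList Int
example = foldr cons [] [1..6]

-- map size example ==> [2, 4]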
Remove the element from the head of the sequence

It's not complex to realize the inverse operation of `cons', which removes an element from the head of the sequence. If the first tree in the forest is a singleton leaf, remove this tree from the forest; otherwise, halve the first tree by unlinking its two children, so that the first tree in the forest becomes two trees, and recursively halve the first tree until it turns into a leaf. Figure 12.4 illustrates the steps of removing elements from the head of the sequence.

[Figure 12.4: Steps of removing elements from head. (a) A sequence of 5 elements; (b) Result of removing x5: the leaf is removed; (c) Result of removing x4: as there is no leaf tree, the tree is first divided into two sub-trees of size 2; the first tree is divided again into two leaves, after which the first leaf, which contains x4, is removed. What is left in the forest is a leaf tree of x3, and a tree of size 2 with elements x2, x1.]

If we assume the sequence isn't empty, so that we can skip the error handling (such as trying to remove an element from an empty sequence), this can be expressed with the following equation. We denote the forest F = {t1, t2, ...} and the trees without the first one as F' = {t2, t3, ...}.

    extractTree(F) = { (t1, F')                      : t1 is a leaf
                     { extractTree({tl, tr} ∪ F')    : otherwise
                                                       (12.5)

where {tl, tr} = unlink(t1) are the two children of t1. It can be translated to the Haskell program below.
extractTree (t@(Leaf x):ts) = (t, ts)
extractTree (t@(Node _ t1 t2):ts) = extractTree (t1:t2:ts)

With this function defined, it's convenient to give the head() and tail() functions; the former returns the first element in the sequence, the latter returns the rest.

    head(S) = key(first(extractTree(S)))        (12.6)

    tail(S) = second(extractTree(S))            (12.7)

Where the function first() returns the first element of a paired value (also known as a tuple), and second() returns the second element respectively. The function key() is used to access the element inside a leaf. Below are the Haskell programs corresponding to these two equations.

head' ts = x where (Leaf x, _) = extractTree ts

tail' = snd . extractTree

Note that as the head and tail functions have already been defined in the Haskell standard library, we give them apostrophes to make them distinct. (Another option is to hide the standard ones by importing; we skip the details as they are language specific.)
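A short usage sketch of these functions, assuming the definitions above:

-- head' (foldr cons [] [1..6])            ==> 1
-- map size (tail' (foldr cons [] [1..6])) ==> [1, 4]

After removing the head, the forest of sizes [2, 4] becomes [1, 4]: the size-2 tree is halved and one leaf is extracted.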
Random access the element in binary random access list

As the trees in the forest help manage the elements in blocks, given an arbitrary index, it's easy to locate which tree the element is stored in; after that, performing a search in the tree yields the result. As all trees are binary (more accurately, complete binary trees), the search is essentially a binary search, which is bound to the logarithm of the tree size. This brings us faster random access capability than the linear search of the linked-list setting.

Given an index i and a sequence S, which is actually a forest of trees, the algorithm is executed as follows[1]:

1. Compare i with the size of the first tree T1 in the forest. If i is less than or equal to the size, the element exists in T1; perform the lookup in T1;
2. Otherwise, decrease i by the size of T1, and repeat the previous step for the rest of the trees in the forest.

This algorithm can be represented by the equation below.

    get(S, i) = { lookupTree(T1, i)    : i ≤ |T1|
                { get(S', i - |T1|)    : otherwise
                                         (12.8)

Where |T| = size(T), and S' = {T2, T3, ...} is the rest of the trees in the forest without the first one. Note that we don't handle the out-of-bound error case; this is left as an exercise to the reader.

The function lookupTree() is just a binary search algorithm. If the index i is 1, we just return the root of the tree; otherwise, we halve the tree by unlinking: if i is less than or equal to the size of the halved tree, we recursively look up the left sub-tree, otherwise we look up the right sub-tree.

    lookupTree(T, i) = { root(T)                              : i = 1
                       { lookupTree(left(T), i)               : i ≤ ⌊|T|/2⌋
                       { lookupTree(right(T), i - ⌊|T|/2⌋)    : otherwise
                                                                (12.9)

Where the function left() returns the left sub-tree Tl of T, while right() returns Tr. The corresponding Haskell program is given below.

getAt (t:ts) i = if i < size t then lookupTree t i
                 else getAt ts (i - size t)

lookupTree (Leaf x) 0 = x
lookupTree (Node sz t1 t2) i = if i < sz `div` 2 then lookupTree t1 i
                               else lookupTree t2 (i - sz `div` 2)

[1] We follow the tradition that the index i starts from 1 in the algorithm description, while it starts from 0 in most programming languages.
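For example (a sketch, using the 0-based indices of the Haskell code), the 4th element of the sequence {1, 2, ..., 6} lives at index 3:

-- getAt (foldr cons [] [1..6]) 3  ==> 4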
Figure 12.5 illustrates the steps of looking up the 4th element in a sequence of size 6. The algorithm first examines the first tree: since its size is 2, which is smaller than 4, it goes on looking in the second tree with the updated index i' = 4 - 2, i.e. the 2nd element of the rest of the forest. As the size of the next tree is 4, which is greater than 2, the element to be searched must be located in this tree. It then examines the left sub-tree, since the new index 2 is not greater than half of the size, 4/2 = 2; the process next visits the right grandchild, and the final result is returned.

[Figure 12.5: Steps of locating the 4th element in a sequence. (a) getAt(S, 4): 4 > size(t1) = 2; (b) getAt(S', 4 - 2) => lookupTree(t2, 2); (c) 2 ≤ ⌊size(t2)/2⌋ => lookupTree(left(t2), 2); (d) lookupTree(right(left(t2)), 1): x3 is returned.]

Using a similar idea, we can update the element at any arbitrary position i.
We first compare the size of the first tree T1 in the forest with i: if it is less than i, the element to be updated doesn't exist in the first tree, and we recursively examine the next tree in the forest, comparing against i - |T1|, where |T1| is the size of the first tree. Otherwise, if the size is greater than or equal to i, the element is in this tree, and we halve the tree recursively until we get a leaf; at this stage, we can replace the element of this leaf with the new one.

    set(S, i, x) = { {updateTree(T1, i, x)} ∪ S'    : i ≤ |T1|
                   { {T1} ∪ set(S', i - |T1|, x)    : otherwise
                                                      (12.10)

Where S' = {T2, T3, ...} is the rest of the trees in the forest without the first one. The function updateTree(T, i, x) performs a tree search and replaces the i-th element with the given value x.

    updateTree(T, i, x) = { leaf(x)                                         : i = 0 ∧ |T| = 1
                          { tree(|T|, updateTree(Tl, i, x), Tr)             : i ≤ ⌊|T|/2⌋
                          { tree(|T|, Tl, updateTree(Tr, i - ⌊|T|/2⌋, x))   : otherwise
                                                                              (12.11)

Where Tl and Tr are the left and right sub-trees of T respectively. The following Haskell program translates the equation accordingly.

setAt :: BRAList a -> Int -> a -> BRAList a
setAt (t:ts) i x = if i < size t then (updateTree t i x):ts
                   else t:setAt ts (i - size t) x

updateTree :: Tree a -> Int -> a -> Tree a
updateTree (Leaf _) 0 x = Leaf x
updateTree (Node sz t1 t2) i x =
    if i < sz `div` 2 then Node sz (updateTree t1 i x) t2
    else Node sz t1 (updateTree t2 (i - sz `div` 2) x)

Due to the nature of complete binary trees, for a sequence with N elements represented by a binary random access list, the number of trees in the forest is bound to O(lg N). Thus it takes O(lg N) time to locate the tree containing the element for an arbitrary index i, and the tree search that follows is bound to the height of the tree, which is O(lg N) as well. So the total performance of random access is O(lg N).
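A short usage sketch, assuming the definitions above:

-- let s = setAt (foldr cons [] [1..6]) 3 100
-- in getAt s 3  ==> 100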
Exercise 12.1

1. The random access algorithm given in this section doesn't handle errors such as out-of-bound indices at all. Modify the algorithm to handle this case, and implement it in your favorite programming language.

2. It's quite possible to realize the binary random access list in imperative settings, which benefits from fast operations on the head of the sequence. The random access can be realized in two steps: first locate the tree, then use the constant-time random access capability of the array. Write a program to implement it in your favorite imperative programming language.
12.3 Numeric representation for binary random access list

In the previous section, we mentioned that for any sequence with N elements, we can represent N in binary format so that N = 2^0 e0 + 2^1 e1 + ... + 2^M eM, where ei is the i-th bit, which can be either 0 or 1. If ei ≠ 0, there is a complete binary tree of size 2^i. This fact indicates an explicit relationship between the binary form of N and the forest: inserting a new element at the head can be simulated by increasing the binary number by one, while removing an element from the head mimics decreasing the corresponding binary number by one. This is known as numeric representation [6].

In order to represent the binary random access list with a binary number, we can define two states for a bit: Zero means there is no tree with the size corresponding to that bit, while One means such a tree exists in the forest; and we can attach the tree to the bit if its state is One. The following Haskell program, for instance, defines such states.

data Digit a = Zero | One (Tree a)

type RAList a = [Digit a]

Here we reuse the definition of the complete binary tree and attach it to the One state. Note that we cache the size information in the tree as well. With the digit defined, the forest can be treated as a list of digits.

Let's see how inserting a new element can be realized as binary number increment. Suppose the function one(t) creates a One state and attaches the tree t to it, and the function getTree(s) gets the tree attached to a One state s. The sequence S is a list of digit states, S = {s1, s2, ...}, and S' is the rest of the digits with the first one removed.

    insertTree(S, t) = { {one(t)}                                        : S = ∅
                       { {one(t)} ∪ S'                                   : s1 = Zero
                       { {Zero} ∪ insertTree(S', link(t, getTree(s1)))   : otherwise
                                                                           (12.12)

When we insert a new tree t into a forest S of binary digits: if the forest is empty, we just create a One state, attach the tree to it, and make this state the only digit of the binary number. This is just like 0 + 1 = 1.

Otherwise, we need to examine the first digit of the binary number. If the first digit is Zero, we just create a One state, attach the tree, and replace the Zero state with the newly created One state. This is just like (...digits...0)2 + 1 = (...digits...1)2. For example 6 + 1 = (110)2 + 1 = (111)2 = 7.

The last case is that the first digit is One. Here we make the assumption that the tree t to be inserted has the same size as the tree attached to this One state. This can be ensured by always calling this function from inserting a leaf, so that the size of the tree to be inserted grows in the series 1, 2, 4, ..., 2^i, .... In such a case, we need to link these two trees (one is t, the other is the tree attached to the One state), and recursively insert the linked result into the rest of the digits. Note that the previous One state has to be replaced with a Zero state.
This is just like (...digits...1)2 + 1 = (...digits'...0)2, where (...digits'...)2 = (...digits...)2 + 1. For example 7 + 1 = (111)2 + 1 = (1000)2 = 8.

Translating this algorithm to Haskell yields the following program.

insertTree :: RAList a -> Tree a -> RAList a
insertTree [] t = [One t]
insertTree (Zero:ts) t = One t : ts
insertTree (One t' :ts) t = Zero : insertTree ts (link t t')

All the other functions, including link(), cons(), etc., are the same as before.
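As a quick sanity check of the binary-increment correspondence, we can print the digit pattern of a forest; toDigits below is a hypothetical helper, not part of the book's code, and it assumes cons is defined over RAList as just described:

toDigits :: RAList a -> String
toDigits = map digit
  where digit Zero    = '0'
        digit (One _) = '1'

-- toDigits (foldr cons [] [1..6]) ==> "011"
-- i.e. (110)2 read from the least significant bit, as expected for 6 elements.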
Next let's see how removing an element from a sequence can be represented as binary number decrement. If the sequence is a singleton One state attached to a leaf, it becomes empty after the removal. This is just like 1 - 1 = 0.

Otherwise, we examine the first digit. If it is a One state, it is replaced with a Zero state to indicate that this tree no longer exists in the forest, as it has been removed. This is just like (...digits...1)2 - 1 = (...digits...0)2. For example 7 - 1 = (111)2 - 1 = (110)2 = 6.

If the first digit in the sequence is a Zero state, we have to borrow from the next digits for the removal. We recursively extract a tree from the rest of the digits, and halve the extracted tree into its two children. Then the Zero state is replaced with a One state attached with the right child, and the left child is removed. This is like (...digits...0)2 - 1 = (...digits'...1)2, where (...digits'...)2 = (...digits...)2 - 1. For example 4 - 1 = (100)2 - 1 = (11)2 = 3. The following equation illustrates this algorithm.

    extractTree(S) = { (t, ∅)                  : S = {one(t)}
                     { (t, {Zero} ∪ S')        : s1 = one(t)
                     { (tl, {one(tr)} ∪ S'')   : otherwise
                                                 (12.13)

Where (t', S'') = extractTree(S'), and tl, tr are the left and right sub-trees of t'. All the other functions, including head() and tail(), are the same as before.
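A direct transcription of equation (12.13) into Haskell could look like the sketch below; it assumes the Digit and RAList definitions above.

extractTree :: RAList a -> (Tree a, RAList a)
extractTree [One t]    = (t, [])
extractTree (One t:ts) = (t, Zero:ts)
extractTree (Zero:ts)  = (t1, One t2 : ts')
  where (Node _ t1 t2, ts') = extractTree ts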
Numeric representation doesn't change the performance of the binary random access list; readers can refer to [2] for a detailed discussion. Let's, for example, analyze the average (amortized) performance of the insertion-on-head algorithm by using aggregation analysis.

Consider the process of inserting N = 2^m elements into an empty binary random access list. The numeric representation of the forest can be listed as follows:

    i          forest (MSB ... LSB)
    0          0, 0, ..., 0, 0
    1          0, 0, ..., 0, 1
    2          0, 0, ..., 1, 0
    3          0, 0, ..., 1, 1
    ...        ...
    2^m - 1    1, 1, ..., 1, 1
    2^m        1, 0, 0, ..., 0, 0

    bits changed: 1, 1, 2, ..., 2^(m-1), 2^m

The LSB of the forest changes every time a new element is inserted; this costs 2^m units of computation. The next bit changes every two insertions, due to a linking operation, so it costs 2^(m-1) units. The bit next to the MSB changes only once, linking all previous trees into one big tree, the only one in the forest; this happens halfway through the total insertion process. After the last element is inserted, the MSB flips to 1. Summing these costs up yields the total cost T = 1 + 1 + 2 + 4 + ... + 2^(m-1) + 2^m = 2^(m+1). So the average cost of one insertion is

    O(T/N) = O(2^(m+1) / 2^m) = O(1)        (12.14)

which proves that the insertion algorithm performs in amortized O(1) constant time. The proof for deletion is left as an exercise to the reader.

12.3.1 Imperative binary access list

It's trivial to implement the binary access list using binary trees, and the recursion can be eliminated by updating the focused tree in a loop; this is left as an exercise to the reader. In this section, we'll show a different imperative implementation that uses the properties of numeric representation.

Recall the chapter about the binary heap: a binary heap can be represented by an implicit array. We can use a similar approach here: use an array of 1 element to represent a leaf, an array of 2 elements to represent a binary tree of height 1, and an array of 2^m elements to represent a complete binary tree of height m. This gives us the capability of accessing any element by index directly, instead of a divide-and-conquer tree search. The expense is that the tree linking operation has to be implemented as array copying. The following ANSI C code
defines such a forest.

#define M (sizeof(int) * 8)

typedef int Key;

struct List {
    int n;
    Key* tree[M];
};

Where n is the number of elements stored in this forest. Of course we can avoid limiting the maximum number of trees by using dynamic arrays, for example as in the following ISO C++ code.

template<typename Key> struct List {
    int n;
    vector<vector<Key> > tree;
};

For illustration purposes only, we use ANSI C here; readers can
find the complete ISO C++ example programs along with this book.

Let's review the insertion process: if the first tree is empty (a Zero digit), we simply set the first tree to a leaf holding the new element; otherwise, the insertion causes tree linking, and such linking may repeat until it reaches a position (digit) where the corresponding tree is empty. The numeric representation reveals an important fact: if the first, second, ..., (i - 1)-th trees all exist, and the i-th tree is empty, the result is a new tree of size 2^i, and all the existing elements, together with the new element to be inserted, are stored in this newly created tree. Moreover, all trees after position i are kept exactly as before.

Is there a good method to locate this position i? As we use a binary number to represent the forest of N elements, after a new element is inserted, N increases to N + 1. Comparing the binary forms of N and N + 1, we find that all bits before i change from 1 to 0, the i-th bit flips from 0 to 1, and all the bits after i keep unchanged. So we can use bit-wise exclusive or (xor) to detect this bit. Here is the algorithm.

function Number-Of-Bits(N)
    i ← 0
    while ⌊N/2⌋ ≠ 0 do
        N ← ⌊N/2⌋
        i ← i + 1
    return i

i ← Number-Of-Bits(N xor (N + 1))

It can be implemented easily with bit shifting, as in the ANSI C code below.

int nbits(int n) {
    int i = 0;
    while (n >>= 1)
        ++i;
    return i;
}

For example, when inserting the 8th element, N = 7 and N xor (N + 1) = (111)2 xor (1000)2 = (1111)2, so nbits returns 3 and a new tree of size 2^3 = 8 is created. The imperative insertion algorithm can thus be realized by first locating the bit which flips from 0 to 1, then creating a new array of size 2^i to represent a complete binary tree, and moving the contents of all trees before this bit into this array, together with the new element to be inserted.

function Insert(L, x)
    i ← Number-Of-Bits(N xor (N + 1))
    Tree(L)[i + 1] ← Create-Array(2^i)
    l ← 1
    Tree(L)[i + 1][l] ← x
    for j ∈ [1, i] do
        for k ∈ [1, 2^j] do
            l ← l + 1
            Tree(L)[i + 1][l] ← Tree(L)[j][k]
        Tree(L)[j] ← NIL
    Size(L) ← Size(L) + 1
    return L

The corresponding ANSI C program is given below.

struct List insert(struct List a, Key x) {
    int i, j, sz;
    Key* xs;
    i = nbits((a.n + 1) ^ a.n);
    xs = a.tree[i] = (Key*) malloc(sizeof(Key) * (1 << i));
    for (j = 0, *xs++ = x, sz = 1; j < i; ++j, sz <<= 1) {
        memcpy((void*)xs, (void*)a.tree[j], sizeof(Key) * sz);
        xs += sz;
        free(a.tree[j]);
        a.tree[j] = NULL;
    }
    ++a.n;
    return a;
}

However, the performance in theory isn't as good as before, because the linking operation downgrades from O(1) constant time to linear array copying. We can again calculate the average (amortized) performance by using aggregation analysis. When inserting N = 2^m elements into an empty list represented by implicit binary trees in arrays, the numeric presentation of the forest of arrays is the same as before, except for the cost of bit flipping:

    bit change cost: 1 * 2^m, 1 * 2^(m-1), 2 * 2^(m-2), ..., 2^(m-2) * 2, 2^(m-1) * 1

The LSB of the forest changes every time a new element is inserted; however, it creates a leaf tree and performs copying only when it changes from 0 to 1, so the copying cost is half of N units, which is 2^(m-1). The next bit flips half as often as the LSB; each time this bit gets flipped to 1, it copies the
first tree as well as the new element into the second tree, so the cost of flipping this bit to 1 is 2 units, not 1. The MSB only flips to 1 at the last insertion, but the cost of flipping it is copying all the previous trees to fill the array of size 2^m. Summing all the costs and distributing them over the N insertions yields the amortized performance below.

    O(T/N) = O((1 * 2^m + 1 * 2^(m-1) + 2 * 2^(m-2) + ... + 2^(m-1) * 1) / 2^m)
           = O(1 + m/2)
           = O(m)        (12.15)

As m = O(lg N), the amortized performance downgrades from constant time to logarithmic; however, it is still faster than normal array insertion, which is O(N) on average.

Random access gets a bit faster, because we can use array indexing instead of tree search.

function Get(L, i)
    for each t ∈ Trees(L) do
        if t ≠ NIL then
            if i ≤ Size(t) then
                return t[i]
            else
                i ← i - Size(t)
Here we skip error handling such as out-of-bound indexing. The ANSI C program for this algorithm is as follows.

Key get(struct List a, int i) {
    int j, sz;
    for (j = 0, sz = 1; j < M; ++j, sz <<= 1)
        if (a.tree[j]) {
            if (i < sz) break;
            i -= sz;
        }
    return a.tree[j][i];
}

The imperative removal and random mutating algorithms are left as exercises to the reader.

Exercise 12.2

1. Implement the random access algorithms, including looking up and updating, for the binary random access list with numeric representation in your favorite programming language.

2. Prove that the amortized performance of deletion is O(1) constant time by using aggregation analysis.

3. Design and implement the binary random access list by implicit array in your favorite imperative programming language.

12.4 Imperative paired-array list

12.4.1 Definition
In the previous chapter about queues, a symmetric paired-array solution was presented. It is capable of operating on both ends of the list. Because arrays support fast random access by nature, it can also be used to realize a fast random access sequence in an imperative setting.

[Figure 12.6: A paired-array list, which consists of 2 arrays linked in head-head manner: x[n] ... x[2] x[1] | y[1] y[2] ... y[m]]

Figure 12.6 shows the design of the paired-array list. Two arrays are linked head to head. To insert a new element at the head of the sequence, the element is appended at the end of the front array; to append a new element at the tail of the sequence, the element is appended at the end of the rear array. Here is an ISO C++ code snippet defining this data structure.

template<typename Key> struct List {
    int n, m;
    vector<Key> front;
    vector<Key> rear;
    List() : n(0), m(0) {}
    int size() { return n + m; }
};

Here we use the vector provided in the standard library to cover the dynamic memory management issues, so that we can concentrate on the algorithm design.

12.4.2 Insertion and appending

Suppose the function Front(L) returns the front array, while Rear(L) returns the rear array. For illustration purposes, we assume the arrays are dynamically allocated. Inserting and appending can be realized as follows.

function Insert(L, x)
    F ← Front(L)
    Size(F) ← Size(F) + 1
    F[Size(F)] ← x

function Append(L, x)
    R ← Rear(L)
    Size(R) ← Size(R) + 1
    R[Size(R)] ← x

As all the above operations manipulate the front and rear arrays at their tails, they are all constant O(1) time. The following are the corresponding ISO C++ programs.

template<typename Key> void insert(List<Key>& xs, Key x) {
    ++xs.n;
    xs.front.push_back(x);
}

template<typename Key> void append(List<Key>& xs, Key x) {
    ++xs.m;
    xs.rear.push_back(x);
}

12.4.3 Random access

As the inner data structure is an array (a dynamic array, as vector), which supports random access by nature, it's trivial to implement a constant time indexing algorithm.
function Get(L, i)
    F ← Front(L)
    N ← Size(F)
    if i ≤ N then
        return F[N - i + 1]
    else
        return Rear(L)[i - N]

Here the index i ∈ [1, |L|] starts from 1. If it is not greater than the size of the front array, the element is stored in the front array. However, as the front and rear arrays are connected head to head, the elements in the front array are in reverse order, so we locate the element by subtracting i from the size of the front array. If the index i is greater than the size of the front array, the element is stored in the rear array. Since elements are stored in normal order in the rear array, we just subtract from the index i an offset, which is the size of the front array. Here is the ISO C++ program implementing this algorithm.

template<typename Key> Key get(List<Key>& xs, int i) {
    if (i < xs.n)
        return xs.front[xs.n - i - 1];
    else
        return xs.rear[i - xs.n];
}

The random mutating algorithm is left as an exercise to the reader.

12.4.4 Removing and balancing

Removing isn't as simple as insertion and appending. This is because we must handle the condition that one array (either front or rear) becomes empty due to removal while the other still contains elements. In the extreme case, the list becomes quite unbalanced, so we must
fix it to resume the balance. One idea is to trigger the fixing when either the front or rear array becomes empty: we just cut the other array in half, and reverse the first half to form the new pair. The algorithm is described as follows.

function Balance(L)
    F ← Front(L), R ← Rear(L)
    N ← Size(F), M ← Size(R)
    if F = ∅ then
        F ← Reverse(R[1 ... ⌊M/2⌋])
        R ← R[⌊M/2⌋ + 1 ... M]
    else if R = ∅ then
        R ← Reverse(F[1 ... ⌊N/2⌋])
        F ← F[⌊N/2⌋ + 1 ... N]

Actually, the operations are symmetric for the case where the front is empty and the case where the rear is empty. Another approach is to swap the front and rear for one symmetric case, recursively resume the balance, and then swap them back. For example, the ISO C++ program below uses this method.

template<typename Key> void balance(List<Key>& xs) {
    if (xs.n == 0) {
        back_insert_iterator<vector<Key> > i(xs.front);
        reverse_copy(xs.rear.begin(), xs.rear.begin() + xs.m / 2, i);
        xs.rear.erase(xs.rear.begin(), xs.rear.begin() + xs.m / 2);
        xs.n = xs.m / 2;
        xs.m -= xs.n;
    } else if (xs.m == 0) {
        swap(xs.front, xs.rear);
        swap(xs.n, xs.m);
        balance(xs);
        swap(xs.front, xs.rear);
        swap(xs.n, xs.m);
    }
}
With the Balance algorithm defined, it's trivial to implement removal on both head and tail.

function Remove-Head(L)
    Balance(L)
    F ← Front(L)
    if F = ∅ then
        Remove-Tail(L)
    else
        Size(F) ← Size(F) - 1

function Remove-Tail(L)
    Balance(L)
    R ← Rear(L)
    if R = ∅ then
        Remove-Head(L)
    else
        Size(R) ← Size(R) - 1

There is an edge case for each: even after balancing, the array targeted for removal may still be empty. This happens when there is only one element stored in the paired-array list. The solution is to remove this last element from the other array, so that the overall list becomes empty. Below is the ISO C++ program implementing this algorithm.

template<typename Key> void remove_head(List<Key>& xs) {
    balance(xs);
    if (xs.front.empty())
        remove_tail(xs); // remove the singleton element in rear
    else {
        xs.front.pop_back();
        --xs.n;
    }
}

template<typename Key> void remove_tail(List<Key>& xs) {
    balance(xs);
    if (xs.rear.empty())
        remove_head(xs); // remove the singleton element in front
    else {
        xs.rear.pop_back();
        --xs.m;
    }
}

It's obvious that the worst case performance is O(N), where N is the number of elements stored in the paired-array list. This happens when balancing is triggered, and both the reverse and the shifting are linear operations. However, the amortized performance of removal is still O(1); the proof is left as an exercise to the reader.

Exercise 12.3

1. Implement the random mutating algorithm in your favorite imperative programming language.

2. We utilized the vector provided in the standard library to manage memory dynamically. Try to realize a version using a plain array and manage the memory allocation manually. Compare this version with the vector one, and consider how this affects the performance.

3. Prove that the amortized performance of removal is O(1) for the paired-array list.

12.5 Concatenate-able list

By using the binary random access list, we realized a sequence data structure which supports O(lg N) time insertion and removal on head, as well as random access of any element by index. However, it's not easy to concatenate two lists. As both lists are forests of complete binary trees, we can't merely merge them (since forests are essentially lists of trees, and for any size there is at most one tree of that size, even concatenating the forests directly is not fast). One solution is to push the elements of the
first sequence one by one onto a stack, then pop them and insert them at the head of the second sequence by using the `cons' function. Of course the stack can be used implicitly via recursion, for instance:

    concat(s1, s2) = { s2                                     : s1 = ∅
                     { cons(head(s1), concat(tail(s1), s2))   : otherwise
                                                                (12.16)

Where the functions cons(), head() and tail() are defined in the previous section. If the lengths of the two sequences are N and M, this method takes O(N lg N) time to repeatedly push all elements of the first sequence onto the stack, and then takes Θ(N lg(N + M)) time to insert the elements in front of the second sequence (see [2] for the detailed definition of this bound notation).

We have already implemented the real-time queue in the previous chapter. It supports O(1) time pop and push. If we can turn sequence concatenation into a kind of queue pushing operation, the performance will be improved to O(1) as well. Okasaki gave such a realization in [6], which can concatenate lists in constant time.
To represent a concatenate-able list, the data structure designed by Okasaki is essentially a K-ary tree. The root of the tree stores the first element in the list, so that we can access it in constant O(1) time. The sub-trees, or children, are all small concatenate-able lists, which are managed by real-time queues. Concatenating another list to the end is just adding it as the last child, which is in turn a queue push operation. Appending a new element can be realized by first wrapping the element into a singleton tree, which is a leaf with no children, then concatenating this singleton to finalize the appending. Figure 12.7 illustrates this data structure.

[Figure 12.7: Data structure for concatenate-able list. (a) The data structure for list {x1, x2, ..., xn}; (b) The result after concatenation with list {y1, y2, ..., ym}.]

Such a recursively designed data structure can be defined in the following Haskell code.

data CList a = Empty | CList a (Queue (CList a)) deriving (Show, Eq)

It means that a concatenate-able list is either empty or a K-ary tree, which again consists of a queue of concatenate-able sub-lists and a root element. Here we reuse the realization of the real-time queue mentioned in the previous chapter.

Suppose the function clist(x, Q) constructs a concatenate-able list from an element x and a queue of sub-lists Q, while the function root(s) returns the root element of such a K-ary tree implemented list, and the function queue(s) returns the queue of sub-lists respectively. We can implement the algorithm to concatenate two lists like this.

    concat(s1, s2) = { s1                      : s2 = ∅
                     { s2                      : s1 = ∅
                     { clist(x, push(Q, s2))   : otherwise
                                                 (12.17)

Where x = root(s1) and Q = queue(s1). The idea of concatenation is that if either one of the lists to be concatenated is empty, the result is just the other list; otherwise, we push the second list as the last child onto the queue of the
first list. Since the push operation is O(1) constant time for a well realized real-time queue, the performance of concatenation is bound to O(1). The concat() function can be translated to the Haskell program below.

concat x Empty = x
concat Empty y = y
concat (CList x q) y = CList x (push q y)

Besides the good performance of concatenation, this design also brings satisfying features for adding elements both on head and tail.

    cons(x, s) = concat(clist(x, ∅), s)       (12.18)

    append(s, x) = concat(s, clist(x, ∅))     (12.19)
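Transcribing (12.18) and (12.19) to Haskell gives the sketch below; emptyQ, the empty real-time queue constructor, is an assumed name from the previous chapter:

cons :: a -> CList a -> CList a
cons x xs = concat (CList x emptyQ) xs

append :: CList a -> a -> CList a
append xs x = concat xs (CList x emptyQ)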
It's a bit complex to realize the algorithm that removes the first element from a concatenate-able list. This is because after the root, which is the first element in the sequence, is removed, we have to re-construct the remainder, a queue of sub-lists, into a K-ary tree. Before diving into the re-construction, let's solve the trivial part first: getting the first element is just returning the root of the K-ary tree.

    head(s) = root(s)        (12.20)

As mentioned above, after the root is removed, we are left with all the children of the K-ary tree. Note that all of them are also concatenate-able lists, so one natural solution is to concatenate them all together into a big list.

    concatAll(Q) = { ∅                                     : Q = ∅
                   { concat(front(Q), concatAll(pop(Q)))   : otherwise
                                                             (12.21)

Where the function front() returns the first element of a queue without removing it, while pop() does the removing work. If the queue is empty, there are no children at all, so the result is also an empty list; otherwise, we pop the first child, which is a concatenate-able list, from the queue, recursively concatenate all the remaining children into a list, and finally concatenate this list behind the already popped first child.

With concatAll() defined, we can then implement the algorithm of removing the first element from a list as below (we name concatAll() as linkAll in the program).

    tail(s) = concatAll(queue(s))        (12.22)

The corresponding Haskell program is given as follows.
head (CList x _) = x
tail (CList _ q) = linkAll q

linkAll q | isEmptyQ q = Empty
          | otherwise = concat (front q) (linkAll (pop q))

The function isEmptyQ is used to test whether a queue is empty; it is trivial and we omit its definition. Readers can refer to the source code along with this book.

The linkAll() algorithm actually traverses the queue data structure and reduces it to a final result. This reminds us of the folding mentioned in the chapter about the binary search tree (readers can refer to the appendix of this book for a detailed description of folding). It's quite possible to define a folding algorithm for queues instead of lists[2] [8].
    foldQ(f, e, Q) = { e                                   : Q = ∅
                     { f(front(Q), foldQ(f, e, pop(Q)))    : otherwise
                                                             (12.23)

The function foldQ() takes three parameters: a function f, which is used for reducing; an initial value e; and the queue Q to be traversed. Here are some examples to illustrate folding on a queue. Suppose a queue Q contains the elements {1, 2, 3, 4, 5} from head to tail.

    foldQ(+, 0, Q) = 1 + (2 + (3 + (4 + (5 + 0)))) = 15
    foldQ(*, 1, Q) = 1 * (2 * (3 * (4 * (5 * 1)))) = 120
    foldQ(*, 0, Q) = 1 * (2 * (3 * (4 * (5 * 0)))) = 0

The function linkAll can be changed to use foldQ accordingly.

    linkAll(Q) = foldQ(concat, ∅, Q)        (12.24)

The Haskell program can be modified as well.

linkAll = foldQ concat Empty

foldQ :: (a -> b -> b) -> b -> Queue a -> b
foldQ f z q | isEmptyQ q = z
            | otherwise = (front q) `f` foldQ f z (pop q)
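As a usage sketch, with a queue built from {1, 2, ..., 5} (fromListQ is a hypothetical helper that pushes the elements in order, not part of the book's code), the examples above read:

-- foldQ (+) 0 (fromListQ [1..5])  ==> 15
-- foldQ (*) 1 (fromListQ [1..5])  ==> 120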
However, the performance of removal can't be ensured in all cases. The worst case is when a user keeps appending N elements to an empty list and then immediately performs removal. At this time, the K-ary tree has the first element stored in the root, and there are N - 1 children, all of which are leaves, so the linkAll() algorithm downgrades to O(N), which is linear time. The average case is amortized O(1), if the add, append, concatenate and remove operations are performed randomly. The proof is left as an exercise to the reader.

Exercise 12.4

1. Can you figure out a solution to append an element to the end of a binary random access list?

2. Prove that the amortized performance of the removal operation is O(1). Hint: use the banker's method.

3. Implement the concatenate-able list in your favorite imperative language.

[2] Some functional programming languages, such as Haskell, define type classes (the concept of a monoid), so that it's easy to support folding on a customized data structure.

12.6 Finger tree

We haven't been able to meet all the performance targets listed at the beginning of this chapter. The binary random access list enables us to insert and remove elements at the head of the sequence, and to randomly access elements quickly. However, it performs poorly when concatenating lists, and there is no good way to append an element at the end of a binary random access list. The concatenate-able list is capable of concatenating multiple lists on the fly, and it performs well when adding new elements both on head and tail. However, it doesn't support random access of an element by index.

These two examples bring us some ideas:

- In order to support fast manipulation both on the head and the tail of the sequence, there must be some way to easily access the head and tail positions;
- A tree-like data structure helps to turn random access into divide and conquer search. If the tree is well balanced, the search is ensured to take logarithmic time.

12.6.1 Definition
The finger tree [6], which was
first invented in 1977, helps to realize efficient sequences, and it is also well implemented in purely functional settings [5].

As we mentioned, the balance of the tree is critical to the performance of search. One option is to use a balanced tree as the underlying data structure for the finger tree, for example the 2-3 tree, which is a special B-tree (readers can refer to the chapter about B-trees in this book). A 2-3 tree contains either 2 or 3 children. It can be defined as below in Haskell.

data Node a = Br2 a a | Br3 a a a

In imperative settings, a node can be defined with a list of sub-nodes, which contains at most 3 children. For instance, the following ANSI C code defines the node.

union Node {
    Key* keys;             /* 2 ~ 3 keys, when the node is a leaf */
    union Node** children; /* 2 ~ 3 sub-nodes, otherwise */
};

Note that in this definition, a node can either contain 2 ~ 3 keys, or 2 ~ 3 sub-nodes, where Key is the type of the elements stored in a leaf node.

We mark the left-most non-leaf node as the front finger and the right-most non-leaf node as the rear finger. Since the fingers are essentially 2-3 trees with all leaves as children, they can be directly represented as lists of 2 or 3 leaves. Of course a finger tree can be empty or contain only one element as a leaf.
So the definition of the finger tree is specified as follows. A finger tree is either empty; or a singleton leaf; or it contains three parts: a left finger, which is a list containing at most 3 elements; a sub finger tree; and a right finger, which is also a list containing at most 3 elements.

Note that this definition is recursive, so it's quite possible to translate it to functional settings. The following Haskell definition summarizes these cases, for example.

data Tree a = Empty
            | Lf a
            | Tr [a] (Tree (Node a)) [a]

In imperative settings, we can
define the finger tree in a similar manner. What's more, we can add a parent field, so that it's possible to back-track to the root from any tree node. The ANSI C code below defines the finger tree accordingly.

struct Tree {
    union Node* front;
    union Node* rear;
    struct Tree* mid;
    struct Tree* parent;
};
We can use the NIL pointer to represent an empty tree; and a leaf tree contains only one element in its front finger, while both its rear finger and middle part are empty. Figures 12.8 and 12.9 show some examples of finger trees.

[Figure 12.8: Examples of finger tree, 1. (a) An empty tree; (b) A singleton leaf; (c) The front and rear fingers contain one element each, and the middle part is empty.]

[Figure 12.9: Examples of finger tree, 2. (a) After inserting 3 extra elements to the front finger, it exceeds the 2-3 tree constraint and isn't balanced any more; (b) The tree resumes balancing: there are 2 elements in the front finger, and the middle part is a leaf which contains a 3-branch 2-3 tree.]

The first example is an empty finger tree; the second one shows the result after inserting one element to empty: it becomes a leaf of one node; the third example shows a finger tree containing 2 elements, one in the front finger and the other in the rear.

If we continuously insert new elements to the tree, those elements will be put into the front finger one by one, until the limit of the 2-3 tree is exceeded. The 4th example shows such a condition: there are 4 elements in the front finger, which isn't balanced any more. The last example shows that the finger tree gets
  • 1494. nger. Note that the middle part is not empty any longer. It's a leaf of a 2-3 tree. The content of the leaf is a tree with 3 branches, each contains an element. We can express these 5 examples as the following Haskell expression. Empty Lf a [b] Empty [a] [e, d, c, b] Empty [a] [f, e] Lf (Br3 d c b) [a] As we mentioned that the de
  • 1496. nger tree is recursive. The middle part besides the front and rear
  • 1497. nger is a deeper
  • 1498. nger tree, which is de
  • 1499. ned as Tree(Node(a)). Every time we go deeper, the Node() is embedded one more level. if the element type of the
  • 1500. rst level tree is a, the element type for the second level tree is Node(a), the third level is Node(Node(a)), ..., the n-th level is Node(Node(Node(:::(a)):::)) = Noden(a), where n indicates the Node() is applied n times. 12.6.2 Insert element to the head of sequence The examples list above actually reveal the typical process that the elements are inserted one by one to a
  • 1501. nger tree. It's possible to summarize these examples to some cases for insertion on head algorithm. When we insert an element x to a
  • 1503. 344 CHAPTER 12. SEQUENCES, THE LAST BRICK If the tree is empty, the result is a leaf which contains the singleton element x; If the tree is a singleton leaf of element y, the result is a new
  • 1504. nger tree. The front
  • 1505. nger contains the new element x, the rear
  • 1506. nger contains the previous element y; the middle part is a empty
  • 1507. nger tree; If the number of elements stored in front
  • 1508. nger isn't bigger than the upper limit of 2-3 tree, which is 3, the new element is just inserted to the head of front
  • 1509. nger; otherwise, it means that the number of elements stored in front
  • 1510. nger exceeds the upper limit of 2-3 tree. the last 3 elements in front
  • 1511. nger is wrapped in a 2-3 tree and recursively inserted to the middle part. the new element x is inserted in front of the rest elements in front
  • 1512. nger. Suppose that function leaf(x) creates a leaf of element x, function tree(F; T0;R) creates a
  • 1513. nger tree from three part: F is the front
  • 1514. nger, which is a list contains several elements. Similarity, R is the rear
  • 1515. nger, which is also a list. T0 is the middle part which is a deeper
  • 1516. nger tree. Function tr3(a; b; c) creates a 2-3 tree from 3 elements a; b; c; while tr2(a; b) creates a 2-3 tree from 2 elements a and b. insertT (x; T) = 8 : leaf(x) : T = tree(fxg;; fyg) : T = leaf(y) tree(fx; x1g; insertT (tr3(x2; x3; x4); T0);R) : T = tree(fx1; x2; x3; x4g; T0;R) tree(fxg [ F; T0;R) : otherwise (12.25) The performance of this algorithm is dominated by the recursive case. All the other cases are constant O(1) time. The recursion depth is proportion to the height of the tree, so the algorithm is bound to O(h) time, where h is the height. As we use 2-3 tree to ensure that the tree is well balanced, h = O(lgN), where N is the number of elements stored in the
Further analysis reveals that the amortized performance of insertT() is O(1), because the expensive recursive cases can be amortized over the other trivial cases. Please refer to [6] and [5] for the detailed proof.

Translating the algorithm yields the below Haskell program.

cons :: a -> Tree a -> Tree a
cons a Empty = Lf a
cons a (Lf b) = Tr [a] Empty [b]
cons a (Tr [b, c, d, e] m r) = Tr [a, b] (cons (Br3 c d e) m) r
cons a (Tr f m r) = Tr (a:f) m r

Here we use the LISP naming convention 'cons' to illustrate inserting a new element at the head of a list.

The insertion algorithm can also be implemented in an imperative approach. Suppose function Tree() creates an empty tree, in which all fields, including the front and rear fingers, the middle part inner tree, and the parent, are empty. Function Node() creates an empty node.

function Prepend-Node(n, T)
    r ← Tree()
    p ← r
    Connect-Mid(p, T)
    while Full?(Front(T)) do
        F ← Front(T)              ▷ F = {n1, n2, n3, ...}
        Front(T) ← {n, F[1]}      ▷ F[1] = n1
        n ← Node()
        Children(n) ← F[2..]      ▷ F[2..] = {n2, n3, ...}
        p ← T
        T ← Mid(T)
    if T = NIL then
        T ← Tree()
        Front(T) ← {n}
    else if |Front(T)| = 1 ∧ Rear(T) = ∅ then
        Rear(T) ← Front(T)
        Front(T) ← {n}
    else
        Front(T) ← {n} ∪ Front(T)
    Connect-Mid(p, T)
    return Flat(r)

Here the notation L[i..] means the sub-list of L with the first i - 1 elements removed; that is, if L = {a1, a2, ..., an}, then L[i..] = {ai, ai+1, ..., an}. Functions Front(), Rear(), Mid(), and Parent() are used to access the front finger, the rear finger, the middle part inner tree, and the parent tree respectively; function Children() accesses the children of a node. Function Connect-Mid(T1, T2) connects T2 as the inner middle part tree of T1, and sets the parent of T2 to T1 if T2 isn't empty.

In this algorithm, we perform a one-pass top-down traversal along the middle part inner tree whenever the front finger is so full that it can't afford to store any more elements. The criterion of fullness for a 2-3 tree is that the finger already contains 3 elements. In such a case, we extract all the elements except the first one, wrap them into a new node (a node one level deeper), and continue inserting this new node into the middle inner tree. The first element is left in the front finger, and the element to be inserted is put in front of it, so that it becomes the new first one in the front finger. After this traversal, the algorithm either reaches an empty tree, or a tree that still has room for more elements in its front finger. We create a new leaf for the former case, and perform a trivial list insertion on the front finger for the latter. During the traversal, we use p to record the parent of the current tree being processed, so that any newly created tree is connected as the middle part inner tree of p. Finally, we return the root of the tree, r.

The last trick of this algorithm is the Flat() function. In order to simplify the logic, we create an empty 'ground' tree and set it as the parent of the root. We need to eliminate this extra 'ground' level before returning the root. This flattening algorithm is realized as the following.

function Flat(T)
    while T ≠ NIL ∧ T is empty do
        T ← Mid(T)
    if T ≠ NIL then
        Parent(T) ← NIL
    return T

The while loop tests whether T is trivially empty: it's not NIL, but both its front and rear fingers are empty. Below Python code implements the insertion algorithm for the finger tree.
def insert(x, t):
    return prepend_node(wrap(x), t)

def prepend_node(n, t):
    root = prev = Tree()
    prev.set_mid(t)
    while frontFull(t):
        f = t.front
        t.front = [n] + f[:1]
        n = wraps(f[1:])
        prev = t
        t = t.mid
    if t is None:
        t = leaf(n)
    elif len(t.front) == 1 and t.rear == []:
        t = Tree([n], None, t.front)
    else:
        t = Tree([n] + t.front, t.mid, t.rear)
    prev.set_mid(t)
    return flat(root)

def flat(t):
    while t is not None and t.empty():
        t = t.mid
    if t is not None:
        t.parent = None
    return t

The implementations of set_mid(), frontFull(), wrap(), wraps(), empty(), and the tree constructor are trivial enough that we skip their details here. Readers can take these as exercises.
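For readers who want a self-contained reference while trying, below is one possible minimal sketch of these helpers. The field layout follows the programs above, but the constructor signatures and names are our assumptions, not the book's reference implementation (section 12.6.8 later augments both classes with a size field and adds it as the first parameter of the Tree constructor).

class Node:
    def __init__(self, children, leaf):
        self.children = children  # either one raw element or a list of sub-nodes
        self.leaf = leaf          # True when this node wraps a single raw element

class Tree:
    def __init__(self, front=None, mid=None, rear=None):
        self.front = front if front is not None else []
        self.rear = rear if rear is not None else []
        self.mid = None
        self.parent = None
        self.set_mid(mid)

    def empty(self):
        return self.front == [] and self.mid is None and self.rear == []

    def set_mid(self, t):
        self.mid = t
        if t is not None:
            t.parent = self

def elem(n):            # the only element stored in a leaf node
    return n.children[0]

def wrap(x):            # wrap a raw element into a leaf node
    return Node([x], True)

def wraps(ns):          # wrap a list of nodes into a one-level deeper node
    return Node(ns, False)

def frontFull(t):       # a finger is 'full' when it already holds 3 nodes
    return t is not None and len(t.front) >= 3

def rearFull(t):
    return t is not None and len(t.rear) >= 3

def leaf(n):            # a tree holding nothing but one node in its front finger
    return Tree([n], None, [])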
12.6.3 Remove element from the head of sequence

It's easy to implement the reverse operation, removing the first element from the sequence, by reversing the insertT() algorithm line by line. Denote F = {f1, f2, ...} as the front finger list, M as the middle part inner finger tree, and R = {r1, r2, ...} as the rear finger list; R' = {r2, r3, ...} is the rest of the elements after the first one is removed from R.

extractT(T) =
    (x, ∅)                            : T = leaf(x)
    (x, leaf(y))                      : T = tree({x}, ∅, {y})
    (x, tree({r1}, ∅, R'))            : T = tree({x}, ∅, R)
    (x, tree(toList(F'), M', R))      : T = tree({x}, M, R), (F', M') = extractT(M)
    (f1, tree({f2, f3, ...}, M, R))   : otherwise        (12.26)

Where function toList(T) converts a 2-3 tree to a plain list as the following.

toList(T) =
    {x, y}      : T = tr2(x, y)
    {x, y, z}   : T = tr3(x, y, z)        (12.27)

Here we skip the error handling, such as trying to remove an element from an empty tree. If the finger tree is a leaf, the result after removal is an empty tree. If the finger tree contains two elements, one in the front finger and the other in the rear, we return the element stored in the front finger as the first element, and the resulting tree after removal is a leaf. If there is only one element in the front finger, the middle part inner tree is empty, and the rear finger isn't empty, we return the only element in the front finger and borrow one element from the rear finger to the front. If there is only one element in the front finger but the middle part inner tree isn't empty, we recursively remove a node from the inner tree, flatten that node to a plain list to replace the front finger, and drop the original only element of the front finger. The last case says that if the front finger contains more than one element, we can just remove the first element from the front finger and keep all the other parts unchanged.

Figure 12.10 shows the steps of removing two elements from the head of a sequence. There are 10 elements stored in the finger tree. When the first element is removed, there is still one element left in the front finger. However, when the next element is removed, the front finger becomes empty. So we 'borrow' one tree node from the middle part inner tree. This node is a 2-3 tree; it is converted to a list of 3 elements, and this list is used as the new front finger. The middle part inner tree changes from three parts to a singleton leaf, which contains only one 2-3 tree node; there are three elements stored in this tree node.

Below is the corresponding Haskell program for 'uncons'.

uncons :: Tree a -> (a, Tree a)
uncons (Lf a) = (a, Empty)
uncons (Tr [a] Empty [b]) = (a, Lf b)
uncons (Tr [a] Empty (r:rs)) = (a, Tr [r] Empty rs)
uncons (Tr [a] m r) = (a, Tr (nodeToList f) m' r) where (f, m') = uncons m
uncons (Tr f m r) = (head f, Tr (tail f) m r)

And the function nodeToList is defined like this.

nodeToList :: Node a -> [a]
nodeToList (Br2 a b) = [a, b]
nodeToList (Br3 a b c) = [a, b, c]

Similar to the above, we can define head and tail functions from uncons (named with apostrophes here to avoid clashing with the Prelude functions used above).

head' = fst . uncons
tail' = snd . uncons
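As a quick check of how these definitions fit together (assuming the programs above), building a small sequence with cons and then deconstructing it gives:

-- cons 1 (cons 2 (cons 3 Empty))  ==>  Tr [1, 2] Empty [3]
-- uncons (Tr [1, 2] Empty [3])    ==>  (1, Tr [2] Empty [3])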
12.6.4 Handling the ill-formed finger tree when removing

The strategy used so far to remove an element from a finger tree is a kind of removing and borrowing: if the front finger becomes empty after removal, we borrow more nodes from the middle part inner tree. However, there exist cases in which the tree is ill-formed; for example, both the front finger of the tree and that of its middle part inner tree are empty. Such an ill-formed tree can result from imperative splitting, which we'll introduce later.

Figure 12.10: Example of removing 2 elements from the head of a sequence. (a) A sequence of 10 elements represented as a finger tree; (b) the first element is removed, and there is one element left in the front finger; (c) another element is removed from the head: we borrow one node from the middle part inner tree, change that node (a 2-3 tree) into a list, and use the list as the new front finger; the middle part inner tree becomes a leaf of one 2-3 tree node.

Figure 12.11: Example of an ill-formed tree. The front finger of the i-th level sub tree isn't empty.

Here we develop an imperative algorithm which can remove the first element from the finger tree even if it is ill-formed. The idea is to first perform a top-down traversal to find a sub tree which either has a non-empty front finger, or has both its front finger and middle part inner tree empty. For the former case, we can safely extract the first element, which is a node, from the front finger. For the latter case, since only the rear finger isn't empty, we can swap it with the empty front finger, reducing it to the former case. After that, we need to examine whether the node we extracted from the front finger is a leaf node (how to do that is left as an exercise to the reader). If not, we go on extracting the first sub node from the children of this node, and leave the rest of the children as the new front finger of the parent of the current tree. We repeatedly go up along the parent field till the node we extracted is a leaf; at that point, we arrive at the root of the tree. Figure 12.12 illustrates this process.

Based on this idea, the following algorithm realizes the removal operation on head. The algorithm assumes that the tree passed in isn't empty.

function Extract-Head(T)
    r ← Tree()
    Connect-Mid(r, T)
    while Front(T) = ∅ ∧ Mid(T) ≠ NIL do
        T ← Mid(T)
    if Front(T) = ∅ ∧ Rear(T) ≠ ∅ then
        Exchange Front(T) ↔ Rear(T)
    n ← Node()
    Children(n) ← Front(T)
    repeat
        L ← Children(n)         ▷ L = {n1, n2, n3, ...}
        n ← L[1]                ▷ n ← n1
        Front(T) ← L[2..]       ▷ L[2..] = {n2, n3, ...}
        T ← Parent(T)
        if Mid(T) becomes empty then
            Mid(T) ← NIL
    until n is a leaf
    return (Elem(n), Flat(r))

Figure 12.12: Traverse bottom-up till a leaf is extracted. (a) Extract the first element n[i][1] and put its children in the front finger of the upper-level tree; (b) repeat this process i times, and finally x[1] is extracted.
Note that function Elem(n) returns the only element stored inside leaf node n. Similar to the imperative insertion algorithm, a stub 'ground' tree is used as the parent of the root, which simplifies the logic a bit. That's why we need to flatten the tree at the end. Below Python program translates the algorithm.

def extract_head(t):
    root = Tree()
    root.set_mid(t)
    while t.front == [] and t.mid is not None:
        t = t.mid
    if t.front == [] and t.rear != []:
        (t.front, t.rear) = (t.rear, t.front)
    n = wraps(t.front)
    while True:  # a repeat-until loop
        ns = n.children
        n = ns[0]
        t.front = ns[1:]
        t = t.parent
        if t.mid.empty():
            t.mid.parent = None
            t.mid = None
        if n.leaf:
            break
    return (elem(n), flat(root))

Member function Tree.empty() returns true if all three parts - the front finger, the rear finger, and the middle part inner tree - are empty. We put a flag Node.leaf to mark whether a node is a leaf or a compound node. The exercise of this section asks the reader to consider some alternatives.

As ill-formed trees are allowed, the algorithms to access the first and last element of the finger tree must be modified, so that they don't blindly return the first or last child of the finger: the finger can be empty if the tree is ill-formed. The idea is quite similar to Extract-Head. In case the finger is empty while the middle part inner tree isn't, we traverse along the inner tree till a point where either the finger becomes non-empty or all the nodes are stored in the other finger. For instance, the following algorithm returns the first leaf node even if the tree is ill-formed.

function First-Lf(T)
    while Front(T) = ∅ ∧ Mid(T) ≠ NIL do
        T ← Mid(T)
    if Front(T) = ∅ ∧ Rear(T) ≠ ∅ then
        n ← Rear(T)[1]
    else
        n ← Front(T)[1]
    while n is NOT a leaf do
        n ← Children(n)[1]
    return n

Note the second loop in this algorithm: it keeps traversing to the first sub-node as long as the current node isn't a leaf. So we always end with a leaf node, and it's trivial to get the element inside it.

function First(T)
    return Elem(First-Lf(T))

The following Python code translates the algorithm to a real program.

def first(t):
    return elem(first_leaf(t))

def first_leaf(t):
    while t.front == [] and t.mid is not None:
        t = t.mid
    if t.front == [] and t.rear != []:
        n = t.rear[0]
    else:
        n = t.front[0]
    while not n.leaf:
        n = n.children[0]
    return n

Accessing the last element is quite similar, and we leave it as an exercise to the reader.

12.6.5 Append element to the tail of the sequence
Because the finger tree is symmetric, we can give the realization of appending an element at the tail by referring to the insertT() algorithm.

appendT(T, x) =
    leaf(x)                                           : T = ∅
    tree({y}, ∅, {x})                                 : T = leaf(y)
    tree(F, appendT(M, tr3(x1, x2, x3)), {x4, x})     : T = tree(F, M, {x1, x2, x3, x4})
    tree(F, M, R ∪ {x})                               : otherwise        (12.28)

Generally speaking, if the rear finger is still a valid 2-3 tree finger, that is, it holds fewer than 4 elements, the new element is directly appended to the rear finger. Otherwise, we break the rear finger: the first 3 elements in the rear finger are used to create a new 2-3 tree, which is recursively appended to the middle part inner tree. If the finger tree is empty or a singleton leaf, it is handled by the first two cases.

Translating the equation to Haskell yields the below program.

snoc :: Tree a -> a -> Tree a
snoc Empty a = Lf a
snoc (Lf a) b = Tr [a] Empty [b]
snoc (Tr f m [a, b, c, d]) e = Tr f (snoc m (Br3 a b c)) [d, e]
snoc (Tr f m r) a = Tr f m (r ++ [a])
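For instance (assuming the definitions above), appending a few elements behaves symmetrically to cons:

-- snoc (snoc (snoc Empty 1) 2) 3  ==>  Tr [1] Empty [2, 3]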
The function name 'snoc' is the mirror of 'cons', which indicates the symmetric relationship.

Appending a new element at the end imperatively is quite similar. The following algorithm realizes appending.

function Append-Node(T, n)
    r ← Tree()
    p ← r
    Connect-Mid(p, T)
    while Full?(Rear(T)) do
        R ← Rear(T)                ▷ R = {n1, n2, ..., nm-1, nm}
        Rear(T) ← {n, Last(R)}     ▷ the last element nm
        n ← Node()
        Children(n) ← R[1...m-1]   ▷ {n1, n2, ..., nm-1}
        p ← T
        T ← Mid(T)
    if T = NIL then
        T ← Tree()
        Front(T) ← {n}
    else if |Rear(T)| = 1 ∧ Front(T) = ∅ then
        Front(T) ← Rear(T)
        Rear(T) ← {n}
    else
        Rear(T) ← Rear(T) ∪ {n}
    Connect-Mid(p, T)
    return Flat(r)

And the corresponding Python program is given as below.

def append_node(t, n):
    root = prev = Tree()
    prev.set_mid(t)
    while rearFull(t):
        r = t.rear
        t.rear = r[-1:] + [n]
        n = wraps(r[:-1])
        prev = t
        t = t.mid
    if t is None:
        t = leaf(n)
    elif len(t.rear) == 1 and t.front == []:
        t = Tree(t.rear, None, [n])
    else:
        t = Tree(t.front, t.mid, t.rear + [n])
    prev.set_mid(t)
    return flat(root)

12.6.6 Remove element from the tail of the sequence

Similar to appendT(), we can realize the algorithm which removes the last element from the finger tree in a manner symmetric to extractT(). We denote the non-empty, non-leaf finger tree as tree(F, M, R), where F is the front finger, M is the middle part inner tree, and R is the rear finger.

removeT(T) =
    (∅, x)                                  : T = leaf(x)
    (leaf(y), x)                            : T = tree({y}, ∅, {x})
    (tree(init(F), ∅, {last(F)}), x)        : T = tree(F, ∅, {x}) ∧ F ≠ ∅
    (tree(F, M', toList(R')), x)            : T = tree(F, M, {x}), (M', R') = removeT(M)
    (tree(F, M, init(R)), last(R))          : otherwise        (12.29)

Function toList(T), which flattens a 2-3 tree to a plain list, was defined previously. Function init(L) returns all elements except the last one in list L; that is, if L = {a1, a2, ..., an-1, an}, then init(L) = {a1, a2, ..., an-1}, and function last(L) returns the last element, last(L) = an. Please refer to the appendix of this book for their implementations.

Algorithm removeT() can be translated to the following Haskell program; we name it 'unsnoc' to indicate that it's the reverse of 'snoc'.

unsnoc :: Tree a -> (Tree a, a)
unsnoc (Lf a) = (Empty, a)
unsnoc (Tr [a] Empty [b]) = (Lf a, b)
unsnoc (Tr f@(_:_) Empty [a]) = (Tr (init f) Empty [last f], a)
unsnoc (Tr f m [a]) = (Tr f m' (nodeToList r), a) where (m', r) = unsnoc m
unsnoc (Tr f m r) = (Tr f m (init r), last r)

And we can define special 'last' and 'init' functions for the finger tree, similar to their counterparts for lists (again renamed with apostrophes to avoid clashing with the Prelude functions used above).

last' = snd . unsnoc
init' = fst . unsnoc

Imperatively removing the element from the end is almost the same as removing from the head. There seems to be a special case: since we always store the only element (or sub node) in the front finger while the rear finger and the middle part inner tree are empty (e.g. Tree({n}, NIL, ∅)), we might get nothing if we always try to fetch the last element from the rear finger. This can be solved by swapping the front and the rear fingers if the rear is empty, as in the following algorithm.

function Extract-Tail(T)
    r ← Tree()
    Connect-Mid(r, T)
    while Rear(T) = ∅ ∧ Mid(T) ≠ NIL do
        T ← Mid(T)
    if Rear(T) = ∅ ∧ Front(T) ≠ ∅ then
        Exchange Front(T) ↔ Rear(T)
    n ← Node()
    Children(n) ← Rear(T)
    repeat
        L ← Children(n)            ▷ L = {n1, n2, ..., nm-1, nm}
        n ← Last(L)                ▷ n ← nm
        Rear(T) ← L[1...m-1]       ▷ {n1, n2, ..., nm-1}
        T ← Parent(T)
        if Mid(T) becomes empty then
            Mid(T) ← NIL
    until n is a leaf
    return (Elem(n), Flat(r))

How to access the last element, as well as how to implement this algorithm as a working program, are left as exercises.

12.6.7 Concatenate

Consider the non-trivial case of concatenating two
finger trees T1 = tree(F1, M1, R1) and T2 = tree(F2, M2, R2). One natural idea is to use F1 as the new front finger of the concatenated result, and keep R2 as the new rear finger. The rest of the work is to merge M1, R1, F2, and M2 into a new middle part inner tree. Note that both R1 and F2 are plain lists of nodes, so the sub-problem is to realize an algorithm like this:

merge(M1, R1 ∪ F2, M2) = ?

More observation reveals that both M1 and M2 are also finger trees, except that they are one level deeper than T1 and T2 in terms of Node(a), where a is the type of element stored in the tree. We can recursively use the same strategy: keep the front finger of M1 and the rear finger of M2, then merge the middle part inner trees of M1 and M2, together with the rear finger of M1 and the front finger of M2. Denote functions front(T), rear(T), and mid(T), which return the front finger, the rear finger, and the middle part inner tree respectively. The above merge() algorithm can be expressed for the non-trivial case as the following.

merge(M1, R1 ∪ F2, M2) = tree(front(M1), S, rear(M2))
S = merge(mid(M1), rear(M1) ∪ R1 ∪ F2 ∪ front(M2), mid(M2))        (12.30)

If we look back at the original concatenation solution, it can be expressed as below.

concat(T1, T2) = tree(F1, merge(M1, R1 ∪ F2, M2), R2)        (12.31)

Comparing it with equation (12.30), it's easy to see that concatenating is essentially merging, so we have the final algorithm like this.

concat(T1, T2) = merge(T1, ∅, T2)        (12.32)

By adding edge cases, the merge() algorithm can be completed as below.

merge(T1, S, T2) =
    foldR(insertT, T2, S)                                   : T1 = ∅
    foldL(appendT, T1, S)                                   : T2 = ∅
    merge(∅, {x} ∪ S, T2)                                   : T1 = leaf(x)
    merge(T1, S ∪ {x}, ∅)                                   : T2 = leaf(x)
    tree(F1, merge(M1, nodes(R1 ∪ S ∪ F2), M2), R2)         : otherwise        (12.33)

Most of these cases are straightforward. If either T1 or T2 is empty, the algorithm repeatedly inserts/appends all the elements of S into the other tree. Functions foldL and foldR are kinds of for-each processes in imperative settings; the difference is that foldL processes the list S from left to right while foldR processes it from right to left. Here are their definitions. Suppose list L = {a1, a2, ..., an-1, an}, and L' = {a2, a3, ..., an-1, an} is the rest of the elements except for the first one.

foldL(f, e, L) =
    e                          : L = ∅
    foldL(f, f(e, a1), L')     : otherwise        (12.34)

foldR(f, e, L) =
    e                          : L = ∅
    f(a1, foldR(f, e, L'))     : otherwise        (12.35)

They are explained in detail in the appendix of this book.

If either of the trees is a leaf, we can insert or append the element of this leaf into S, so that it becomes the trivial case of concatenating an empty tree with another. Function nodes() is used to wrap a list of elements into a list of 2-3 trees. This is because the contents of the middle part inner tree, compared to the contents of a finger, are one level deeper in terms of Node(). Consider the time point at which the recursion turns from the recursive case to an edge case, and suppose M1 is empty at that time: we then need to repeatedly insert all elements from R1 ∪ S ∪ F2 into M2. However, we can't do the insertion directly. If the element type is a, we can only insert Node(a), which is a 2-3 tree, into M2. This is just like what we did in the insertT() algorithm: take out the last 3 elements, wrap them in a 2-3 tree, and recursively perform insertT(). Here is the definition of nodes().

nodes(L) =
    {tr2(x1, x2)}                               : L = {x1, x2}
    {tr3(x1, x2, x3)}                           : L = {x1, x2, x3}
    {tr2(x1, x2), tr2(x3, x4)}                  : L = {x1, x2, x3, x4}
    {tr3(x1, x2, x3)} ∪ nodes({x4, x5, ...})    : otherwise        (12.36)

Function nodes() follows the constraint of 2-3 trees: if there are only 2 or 3 elements in the list, it just wraps them in a singleton list containing one 2-3 tree; if there are 4 elements in the list, it splits them into two trees, each consisting of 2 branches; otherwise, if there are more than 4 elements, it wraps the first three into one tree with 3 branches, and recursively calls nodes() to process the rest.

The performance of concatenation is determined by merging. Analyzing the recursive case of merging reveals that the depth of recursion is proportional to the smaller height of the two trees. As the trees are ensured to be balanced by using 2-3 trees, their height is bound to O(lg N'), where N' is the number of elements. Each edge case of merging performs the same as insertion (it calls insertT() at most 8 times), which is amortized O(1) time, and O(lg M) in the worst case, where M is the difference in height of the two trees. So the overall performance is bound to O(lg N), where N is the total number of elements contained in the two finger trees.

The following Haskell program implements the concatenation algorithm.

concat :: Tree a -> Tree a -> Tree a
concat t1 t2 = merge t1 [] t2

Note that there is a 'concat' function defined in the Prelude standard library, so we need to distinguish them, either by a hiding import or by taking a different name.

merge :: Tree a -> [a] -> Tree a -> Tree a
merge Empty ts t2 = foldr cons t2 ts
merge t1 ts Empty = foldl snoc t1 ts
merge (Lf a) ts t2 = merge Empty (a:ts) t2
merge t1 ts (Lf a) = merge t1 (ts ++ [a]) Empty
merge (Tr f1 m1 r1) ts (Tr f2 m2 r2) = Tr f1 (merge m1 (nodes (r1 ++ ts ++ f2)) m2) r2

And the implementation of nodes() is as below.

nodes :: [a] -> [Node a]
nodes [a, b] = [Br2 a b]
nodes [a, b, c] = [Br3 a b c]
nodes [a, b, c, d] = [Br2 a b, Br2 c d]
nodes (a:b:c:xs) = Br3 a b c : nodes xs
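As a small worked example (assuming the programs above), concatenating two 2-element sequences pushes the inner fingers down into the middle part:

-- concat (Tr [1] Empty [2]) (Tr [3] Empty [4])
--   = Tr [1] (merge Empty (nodes [2, 3]) Empty) [4]
--   = Tr [1] (Lf (Br2 2 3)) [4]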
To concatenate two finger trees T1 and T2 in the imperative approach, we can traverse the two trees along their middle part inner trees till either tree becomes empty. In every iteration, we create a new tree T, choose the front finger of T1 as the front finger of T, and choose the rear finger of T2 as the rear finger of T. The other two fingers (the rear finger of T1 and the front finger of T2) are put together as a list, and this list is then grouped, in a balanced way, into several 2-3 tree nodes, N. Note that N grows along with the traversal, and not only in terms of length: the depth of its elements increases by one in each iteration. We attach this new tree as the middle part inner tree of the upper-level result tree to end this iteration.

Once either tree becomes empty, we stop traversing, repeatedly insert the 2-3 tree nodes in N into the other, non-empty tree, and set the result as the new middle part inner tree of the upper-level result. Below algorithm describes this process in detail.

function Concat(T1, T2)
    return Merge(T1, ∅, T2)

function Merge(T1, N, T2)
    r ← Tree()
    p ← r
    while T1 ≠ NIL ∧ T2 ≠ NIL do
        T ← Tree()
        Front(T) ← Front(T1)
        Rear(T) ← Rear(T2)
        Connect-Mid(p, T)
        p ← T
        N ← Nodes(Rear(T1) ∪ N ∪ Front(T2))
        T1 ← Mid(T1)
        T2 ← Mid(T2)
    if T1 = NIL then
        T ← T2
        for each n ∈ Reverse(N) do
            T ← Prepend-Node(n, T)
    else if T2 = NIL then
        T ← T1
        for each n ∈ N do
            T ← Append-Node(T, n)
    Connect-Mid(p, T)
    return Flat(r)

Note that the for-each loops in the algorithm can also be replaced by folding from left and from right respectively. Translating this algorithm to a Python program yields the below code.

def concat(t1, t2):
    return merge(t1, [], t2)

def merge(t1, ns, t2):
    root = prev = Tree()  # sentinel dummy tree
    while t1 is not None and t2 is not None:
        t = Tree(t1.size + t2.size + sizeNs(ns), t1.front, None, t2.rear)
        prev.set_mid(t)
        prev = t
        ns = nodes(t1.rear + ns + t2.front)
        t1 = t1.mid
        t2 = t2.mid
    if t1 is None:
        prev.set_mid(foldR(prepend_node, ns, t2))
    elif t2 is None:
        prev.set_mid(reduce(append_node, ns, t1))
    return flat(root)

Because Python only provides the folding-from-left function reduce(), a folding-from-right function is given below, matching what we showed in the pseudo-code: it repeatedly applies the function in the reverse order of the list.

def foldR(f, xs, z):
    for x in reversed(xs):
        z = f(x, z)
    return z

The only function remaining in question is how to group nodes, in a balanced way, into bigger 2-3 trees. As a 2-3 tree can hold at most 3 sub-trees, we can first take 3 nodes and wrap them into a ternary tree if there are more than 4 nodes in the list, and continuously deal with the rest. If there are just 4 nodes, they can be wrapped into two binary trees. For the other cases (3 trees, 2 trees, or 1 tree), we simply wrap them all into one tree.

Denote the node list L = {n1, n2, ...}. The following algorithm realizes this process.

function Nodes(L)
    N ← ∅
    while |L| > 4 do
        n ← Node()
        Children(n) ← L[1..3]        ▷ {n1, n2, n3}
        N ← N ∪ {n}
        L ← L[4...]                  ▷ {n4, n5, ...}
    if |L| = 4 then
        x ← Node()
        Children(x) ← {L[1], L[2]}
        y ← Node()
        Children(y) ← {L[3], L[4]}
        N ← N ∪ {x, y}
    else if L ≠ ∅ then
        n ← Node()
        Children(n) ← L
        N ← N ∪ {n}
    return N

It's straightforward to translate this algorithm to the below Python program, where function wraps() helps to create an empty node and set a list as the children of this node.

def nodes(xs):
    res = []
    while len(xs) > 4:
        res.append(wraps(xs[:3]))
        xs = xs[3:]
    if len(xs) == 4:
        res.append(wraps(xs[:2]))
        res.append(wraps(xs[2:]))
    elif xs != []:
        res.append(wraps(xs))
    return res

Exercise 12.5

1. Implement the complete finger tree insertion program in your favorite imperative programming language. Don't check the example programs along with this chapter before having a try.

2. How can we determine that a node is a leaf? Does it contain only a raw element, or is it a compound node containing sub-nodes as children? Note that we can't distinguish them by testing the size, as there are cases where a node contains a singleton leaf, such as node(1, {node(1, {x})}). Try to solve this problem in both a dynamically typed language (e.g. Python, Lisp etc.) and a strongly, statically typed language (e.g. C++).

3. Implement the Extract-Tail algorithm in your favorite imperative programming language.

4. Realize an algorithm to return the last element of a finger tree in both functional and imperative approaches. The latter one should be able to handle ill-formed trees.

5. Try to implement the concatenation algorithm without using folding. You can either use recursive methods, or use the imperative for-each method.
12.6.8 Random access of finger tree

Size augmentation

The strategy to provide fast random access is to turn the looking up into a tree search. In order to avoid calculating the size of the tree many times, we augment the tree and the node with an extra field. The definition of the tree should be modified accordingly; for example, the following Haskell definition adds a size field in its constructor.

data Tree a = Empty | Lf a | Tr Int [a] (Tree (Node a)) [a]

And the previous ANSI C structure is augmented with size as well.

struct Tree {
  union Node* front;
  union Node* rear;
  struct Tree* mid;
  struct Tree* parent;
  int size;
};

Suppose the function tree(s, F, M, R) creates a finger tree from size s, front finger F, rear finger R, and middle part inner tree M. When the size of the tree is needed, we can call a size(T) function. It will be something like this.

size(T) =
    0   : T = ∅
    ?   : T = leaf(x)
    s   : T = tree(s, F, M, R)

If the tree is empty, the size is definitely zero; and if it can be expressed as tree(s, F, M, R), the size is s. However, what if the tree is a singleton leaf? Is its size 1? It can be 1 only if T = leaf(a) and a isn't a tree node, but a raw element stored in the finger tree. In most cases, the size is not 1, because a can again be a tree node. That's why we put a '?' in the above equation. The correct way is to call some size function on the tree node, as the following.

size(T) =
    0          : T = ∅
    size'(x)   : T = leaf(x)
    s          : T = tree(s, F, M, R)        (12.37)

Note that this isn't a recursive definition, since size ≠ size': the argument to size' is either a tree node, which is a 2-3 tree, or a plain element stored in the finger tree. To unify these two cases, we can anyway wrap the single plain element into a tree node of only one element, so that every situation is expressed as a tree node augmented with a size field. The following Haskell program modifies the definition of the tree node.

data Node a = Br Int [a]

The ANSI C node definition is modified accordingly.

struct Node {
  Key key;
  struct Node* children;
  int size;
};

We change it from a union to a structure, although there is an overhead field 'key' when the node isn't a leaf.

Suppose function tr(s, L) creates such a node (either one wrapped element or a 2-3 tree) from size information s and a list L. Here are some examples.

tr(1, {x})        a tree node containing only one element
tr(2, {x, y})     a 2-3 tree containing two elements
tr(3, {x, y, z})  a 2-3 tree containing three elements

So the function size' can be implemented as returning the size information of a tree node; we have size'(tr(s, L)) = s. Wrapping an element x is just calling tr(1, {x}). We can define auxiliary functions wrap and unwrap, for instance:

wrap(x) = tr(1, {x})
unwrap(n) = x : n = tr(1, {x})        (12.38)

As both the front finger and the rear finger are lists of tree nodes, in order to calculate the total size of a finger, we can provide a size''(L) function, which sums up the sizes of all nodes stored in the list. Denote L = {a1, a2, ...} and L' = {a2, a3, ...}.

size''(L) =
    0                          : L = ∅
    size'(a1) + size''(L')     : otherwise        (12.39)

It's quite OK to define size''(L) by using some higher-order functions, for example:

size''(L) = sum(map(size', L))        (12.40)

And we can turn a list of tree nodes into one deeper 2-3 tree and vice versa.

wraps(L) = tr(size''(L), L)
unwraps(n) = L : n = tr(s, L)        (12.41)

These helper functions are translated to the following Haskell code.

size (Br s _) = s
sizeL = sum . map size
sizeT Empty = 0
sizeT (Lf a) = size a
sizeT (Tr s _ _ _) = s

Here are the wrap and unwrap auxiliary functions.

wrap x = Br 1 [x]
unwrap (Br 1 [x]) = x
wraps xs = Br (sizeL xs) xs
unwraps (Br _ xs) = xs

We omit their type definitions for illustration purposes.

In imperative settings, the size information for nodes and trees can be accessed through the size field, and the size of a list of nodes can be summed up over this field, as in the below algorithm.

function Size-Nodes(L)
    s ← 0
    for ∀n ∈ L do
        s ← s + Size(n)
    return s

The following Python code, for example, translates this algorithm by using the standard sum() and map() functions provided in the library.

def sizeNs(xs):
    return sum(map(lambda x: x.size, xs))

As NIL is typically used to represent an empty tree in imperative settings, it's convenient to provide an auxiliary size function to uniformly calculate the size of a tree, whether or not it is NIL.

function Size-Tr(T)
    if T = NIL then
        return 0
    else
        return Size(T)

The algorithm is trivial and we skip its implementation example program.
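For reference, a minimal Python sketch matching the sizeT() calls made by the later programs in this section could be:

def sizeT(t):
    # size of a possibly-NIL tree
    return 0 if t is None else t.size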
Modification due to the augmented size

The algorithms presented so far need to be modified to accommodate the augmented size. For example, the insertT() function now inserts a tree node instead of a plain element.

insertT(x, T) = insertT'(wrap(x), T)        (12.42)

The corresponding Haskell program is changed as below.

cons a t = cons' (wrap a) t

After being wrapped, x is augmented with size information of 1. In the implementation of the previous insertion algorithm, function tree(F, M, R) is used to create a finger tree from a front finger, a middle part inner tree, and a rear finger. This function should also be modified to handle the size information of these three arguments.

tree'(F, M, R) =
    fromL(F)                                            : M = ∅ ∧ R = ∅
    fromL(R)                                            : M = ∅ ∧ F = ∅
    tree'(unwraps(F'), M', R)                           : F = ∅, (F', M') = extractT'(M)
    tree'(F, M', unwraps(R'))                           : R = ∅, (M', R') = removeT'(M)
    tree(size''(F) + size(M) + size''(R), F, M, R)      : otherwise        (12.43)

Where fromL() helps to turn a list of nodes into a finger tree by repeatedly inserting all the elements one by one into an empty tree.

fromL(L) = foldR(insertT', ∅, L)

Of course it can also be implemented in a pure recursive manner without using folding.

The last case is the most straightforward one: if none of F, M, and R is empty, it adds up the sizes of these three parts and constructs the tree along with this size information by calling the tree(s, F, M, R) function. If both the middle part inner tree and one of the fingers are empty, the algorithm repeatedly inserts all elements stored in the other finger into an empty tree, so that the result is constructed from a list of tree nodes. If the middle part inner tree isn't empty and one of the fingers is empty, the algorithm 'borrows' one tree node from the middle part, either by extracting from the head if the front finger is empty, or by removing from the tail if the rear finger is empty. Then the algorithm unwraps the 'borrowed' tree node into a list, and recursively calls the tree'() function to construct the result.

This algorithm can be translated to the following Haskell code, for example.

tree f Empty [] = foldr cons' Empty f
tree [] Empty r = foldr cons' Empty r
tree [] m r = let (f, m') = uncons' m in tree (unwraps f) m' r
tree f m [] = let (m', r) = unsnoc' m in tree f m' (unwraps r)
tree f m r = Tr (sizeL f + sizeT m + sizeL r) f m r

Function tree'() helps to minimize the modification. insertT'() can be realized by using it like the following.

insertT'(x, T) =
    leaf(x)                                                    : T = ∅
    tree'({x}, ∅, {y})                                         : T = leaf(y)
    tree'({x, x1}, insertT'(wraps({x2, x3, x4}), M), R)        : T = tree(s, {x1, x2, x3, x4}, M, R)
    tree'({x} ∪ F, M, R)                                       : otherwise        (12.44)

And its corresponding Haskell code is a line-by-line translation.

cons' a Empty = Lf a
cons' a (Lf b) = tree [a] Empty [b]
cons' a (Tr _ [b, c, d, e] m r) = tree [a, b] (cons' (wraps [c, d, e]) m) r
cons' a (Tr _ f m r) = tree (a:f) m r

A similar modification for the augmented size should also be applied to the imperative algorithms; for example, when a new node is prepended to the head of the finger tree, we should update the size as we traverse the tree.

function Prepend-Node(n, T)
    r ← Tree()
    p ← r
    Connect-Mid(p, T)
    while Full?(Front(T)) do
        F ← Front(T)
        Front(T) ← {n, F[1]}
        Size(T) ← Size(T) + Size(n)        ▷ update size
        n ← Node()
        Children(n) ← F[2..]
        p ← T
        T ← Mid(T)
    if T = NIL then
        T ← Tree()
        Front(T) ← {n}
    else if |Front(T)| = 1 ∧ Rear(T) = ∅ then
        Rear(T) ← Front(T)
        Front(T) ← {n}
    else
        Front(T) ← {n} ∪ Front(T)
    Size(T) ← Size(T) + Size(n)        ▷ update size
    Connect-Mid(p, T)
    return Flat(r)

The corresponding Python code is modified accordingly as below.

def prepend_node(n, t):
    root = prev = Tree()
    prev.set_mid(t)
    while frontFull(t):
        f = t.front
        t.front = [n] + f[:1]
        t.size = t.size + n.size
        n = wraps(f[1:])
        prev = t
        t = t.mid
    if t is None:
        t = leaf(n)
    elif len(t.front) == 1 and t.rear == []:
        t = Tree(n.size + t.size, [n], None, t.front)
    else:
        t = Tree(n.size + t.size, [n] + t.front, t.mid, t.rear)
    prev.set_mid(t)
    return flat(root)

Note that the tree constructor is also modified to take a size argument as the first parameter, and the leaf() helper function not only constructs the tree from a node, but also sets the size of the tree to the size of the node inside it. For simplification purposes, we skip the detailed description of what is modified in extractT'(), appendT'(), removeT'(), and concat(). They are left as exercises to the reader.

Split a finger tree at a given position
With size information augmented, it's easy to locate a node at a given position by performing a tree search. What's more, as the finger tree is constructed from the three parts F, M, and R, and is recursive by nature, it's also possible to split it into three sub parts at a given position i: the left part, the node at i, and the right part.

The idea is straightforward. Since we have the size information for F, M, and R, denote these three sizes as Sf, Sm, and Sr. If the given position satisfies i < Sf, the node must be stored in F, and we go on seeking the node inside F; if Sf ≤ i < Sf + Sm, the node must be stored in M, and we recursively perform the search in M; otherwise, the node is in R, and we search inside R. If we skip the error handling of trying to split an empty tree, there is only one edge case, as below.

splitAt(i, T) =
    (∅, x, ∅)   : T = leaf(x)
    ...         : otherwise

Splitting a leaf results in both the left and right parts being empty; the node stored in the leaf is the resulting node. The recursive case handles the three sub-cases by comparing i with the sizes. Suppose function splitAtL(i, L) splits a list of nodes at a given position i into three parts, (A, x, B) = splitAtL(i, L), where x is the i-th node in L, A is the sub-list of all nodes before position i, and B is the sub-list of all the rest of the nodes after i.

splitAt(i, T) =
    (∅, x, ∅)                                 : T = leaf(x)
    (fromL(A), x, tree'(B, M, R))             : i < Sf, (A, x, B) = splitAtL(i, F)
    (tree'(F, Ml, A), x, tree'(B, Mr, R))     : Sf ≤ i < Sf + Sm
    (tree'(F, M, A), x, fromL(B))             : otherwise, (A, x, B) = splitAtL(i - Sf - Sm, R)        (12.45)

Where Ml, x, Mr, A, and B in the third case are calculated as the following.

(Ml, t, Mr) = splitAt(i - Sf, M)
(A, x, B) = splitAtL(i - Sf - size(Ml), unwraps(t))

The function splitAtL() is just a linear traversal; since the length of the list is limited by the constraint of the 2-3 tree, the performance is still constant O(1) time. Denote L = {x1, x2, ...} and L' = {x2, x3, ...}.

splitAtL(i, L) =
    (∅, x1, ∅)          : i = 0 ∧ L = {x1}
    (∅, x1, L')         : i < size'(x1)
    ({x1} ∪ A, x, B)    : otherwise, (A, x, B) = splitAtL(i - size'(x1), L')        (12.46)

The solution of splitting is a typical divide and conquer strategy. The performance of this algorithm is determined by the recursive case, searching in the middle part inner tree; the other cases are all constant time, as we've analyzed. The depth of recursion is proportional to the height of the tree h, so the algorithm is bound to O(h). Because the tree is well balanced (by using 2-3 trees, all the insertion/removal algorithms keep the tree balanced), h = O(lg N), where N is the number of elements stored in the finger tree. The overall performance of splitting is O(lg N).

Let's first give the Haskell program for the splitAtL() function.

splitNodesAt 0 [x] = ([], x, [])
splitNodesAt i (x:xs) | i < size x = ([], x, xs)
                      | otherwise = let (xs', y, ys) = splitNodesAt (i - size x) xs
                                    in (x:xs', y, ys)

Then the program for splitAt(). As there is already a function with this name defined in the standard library, we slightly change the name by adding an apostrophe.

splitAt' _ (Lf x) = (Empty, x, Empty)
splitAt' i (Tr _ f m r)
    | i < szf = let (xs, y, ys) = splitNodesAt i f
                in ((foldr cons' Empty xs), y, tree ys m r)
    | i < szf + szm = let (m1, t, m2) = splitAt' (i - szf) m
                          (xs, y, ys) = splitNodesAt (i - szf - sizeT m1) (unwraps t)
                      in (tree f m1 xs, y, tree ys m2 r)
    | otherwise = let (xs, y, ys) = splitNodesAt (i - szf - szm) r
                  in (tree f m xs, y, foldr cons' Empty ys)
    where szf = sizeL f
          szm = sizeT m

Random access

With the help of splitting at an arbitrary position, it's trivial to realize random access in O(lg N) time. Denote function mid(x), which returns the 2nd element of a tuple, and left(x) and right(x), which return the 1st element and the 3rd element of the tuple respectively.

getAt(S, i) = unwrap(mid(splitAt(i, S)))        (12.47)

It first splits the sequence at position i, then unwraps the node to get the element stored inside it. To mutate the i-th element of a sequence S represented by a finger tree, we first split it at i, then replace the middle with what we want it mutated to, and re-construct them into one finger tree by using concatenation.

setAt(S, i, x) = concat(L, insertT(x, R))        (12.48)

where (L, y, R) = splitAt(i, S).

What's more, we can also realize a removeAt(S, i) function, which removes the i-th element from sequence S. The idea is to first split at i, unwrap and return the element of the i-th node, then concatenate the left and right parts into a new finger tree.

removeAt(S, i) = (unwrap(y), concat(L, R))        (12.49)

These handy algorithms can be translated to the following Haskell program.

getAt t i = unwrap x where (_, x, _) = splitAt' i t
setAt t i x = let (l, _, r) = splitAt' i t in concat' l (cons x r)
removeAt t i = let (l, x, r) = splitAt' i t in (unwrap x, concat' l r)

Imperative random access

As we can directly mutate the tree in imperative settings, it's possible to realize Get-At(T, i) and Set-At(T, i, x) without using splitting. The idea is to first implement an algorithm which applies some operation at a given position. The following algorithm takes three arguments: a finger tree T, a position index i which ranges from zero to the number of elements stored in the tree, and a function f, which will be applied to the element at i.

function Apply-At(T, i, f)
    while Size(T) > 1 do
        Sf ← Size-Nodes(Front(T))
        Sm ← Size-Tr(Mid(T))
        if i < Sf then
            return Lookup-Nodes(Front(T), i, f)
        else if i < Sf + Sm then
            T ← Mid(T)
            i ← i - Sf
        else
            return Lookup-Nodes(Rear(T), i - Sf - Sm, f)
    n ← First-Lf(T)
    x ← Elem(n)
    Elem(n) ← f(x)
    return x

This algorithm is essentially a divide and conquer tree search. It repeatedly examines the current tree till it reaches a tree of size 1 (can it be determined to be a leaf? Please consider the ill-formed case and refer to the exercise later). Each time, it checks the position to be located against the size information of the front finger and the middle part inner tree.

If the index i is less than the size of the front finger, the location is at some node in it; the algorithm calls a sub procedure to look up in the front finger. If the index is between the size of the front finger and the total size up to the middle part inner tree, the location is at some node inside the middle; the algorithm goes on traversing along the middle part inner tree with the index reduced by the size of the front finger. Otherwise, the location is at some node in the rear finger, and the similar look-up procedure is called accordingly.

After this loop, we've got a node (it can be a compound node) with what we are looking for stored at the first leaf inside it. We can extract the element out, apply the function f on it, and store the new value back. The algorithm returns the previous element, before f was applied, as the final result.

What hasn't been factored out yet is the algorithm Lookup-Nodes(L, i, f). It takes a list of nodes, a position index, and a function to be applied. This algorithm can be implemented by checking every node in the list. If the node is a leaf and the index is zero, we are at the right position to be looked up: the function can be applied on the element stored in this leaf, and the previous value is returned. Otherwise, we compare the size of this node with the index to determine whether the position is inside this node, and search inside the children of the node if necessary.

function Lookup-Nodes(L, i, f)
    loop
        for ∀n ∈ L do
            if n is a leaf ∧ i = 0 then
                x ← Elem(n)
                Elem(n) ← f(x)
                return x
            if i < Size(n) then
                L ← Children(n)
                break
            i ← i - Size(n)

The following Python code implements these algorithms.

def applyAt(t, i, f):
    while t.size > 1:
        szf = sizeNs(t.front)
        szm = sizeT(t.mid)
        if i < szf:
            return lookupNs(t.front, i, f)
        elif i < szf + szm:
            t = t.mid
            i = i - szf
        else:
            return lookupNs(t.rear, i - szf - szm, f)
    n = first_leaf(t)
    x = elem(n)
    n.children[0] = f(x)
    return x

def lookupNs(ns, i, f):
    while True:
        for n in ns:
            if n.leaf and i == 0:
                x = elem(n)
                n.children[0] = f(x)
                return x
            if i < n.size:
                ns = n.children
                break
            i = i - n.size

With an auxiliary algorithm that can apply a function at a given position, it's trivial to implement Get-At() and Set-At() by passing special functions.

function Get-At(T, i)
    return Apply-At(T, i, λx.x)

function Set-At(T, i, x)
    return Apply-At(T, i, λy.x)

That is, we pass the identity function to implement getting the element at a position, which doesn't change anything at all; and we pass a constant function to implement setting, which sets the element to the new value, ignoring its previous value.
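A direct Python rendering of these two wrappers, on top of the applyAt() program above, could look like the following sketch (the naming is ours):

def getAt(t, i):
    return applyAt(t, i, lambda x: x)   # identity function: read without changing

def setAt(t, i, x):
    return applyAt(t, i, lambda y: x)   # constant function: overwrite, return old value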
Imperative splitting

It's not enough to just realize the Apply-At algorithm in imperative settings, because removing an element at an arbitrary position is also a typical case. Almost all the imperative finger tree algorithms so far work in a one-pass top-down manner, although we sometimes need to book-keep the root; it means that we could even realize all of them without using the parent field. The splitting operation, however, can be easily implemented by using the parent field. We first perform a top-down traversal along the middle part inner tree, as long as the splitting position isn't located in the front or rear finger. After that, we need a bottom-up traversal along the parent fields of the two split trees to fill out the remaining fields.

function Split-At(T, i)
    T1 ← Tree()
    T2 ← Tree()
    while Sf ≤ i < Sf + Sm do        ▷ Top-down pass
        T'1 ← Tree()
        T'2 ← Tree()
        Front(T'1) ← Front(T)
        Rear(T'2) ← Rear(T)
        Connect-Mid(T1, T'1)
        Connect-Mid(T2, T'2)
        T1 ← T'1
        T2 ← T'2
        i ← i - Sf
        T ← Mid(T)
    if i < Sf then
        (X, n, Y) ← Split-Nodes(Front(T), i)
        T'1 ← From-Nodes(X)
        T'2 ← T
        Size(T'2) ← Size(T) - Size-Nodes(X) - Size(n)
        Front(T'2) ← Y
    else if Sf + Sm ≤ i then
        (X, n, Y) ← Split-Nodes(Rear(T), i - Sf - Sm)
        T'2 ← From-Nodes(Y)
        T'1 ← T
        Size(T'1) ← Size(T) - Size-Nodes(Y) - Size(n)
        Rear(T'1) ← X
    Connect-Mid(T1, T'1)
    Connect-Mid(T2, T'2)
    i ← i - Size-Tr(T'1)
    while n is NOT a leaf do        ▷ Bottom-up pass
        (X, n, Y) ← Split-Nodes(Children(n), i)
        i ← i - Size-Nodes(X)
        Rear(T1) ← X
        Front(T2) ← Y
        Size(T1) ← Sum-Sizes(T1)
        Size(T2) ← Sum-Sizes(T2)
        T1 ← Parent(T1)
        T2 ← Parent(T2)
    return (Flat(T1), Elem(n), Flat(T2))

The algorithm first creates two trees, T1 and T2, to hold the split results. Note that they are created as 'ground' trees, which are parents of the roots. The first pass is a top-down pass. Suppose Sf and Sm retrieve the size of the front finger and the size of the middle part inner tree respectively. If the position at which the tree is to be split is located in the middle part inner tree, we reuse the front finger of T for the newly created T'1, and reuse the rear finger of T for T'2. At this time point, we can't fill the other fields for T'1 and T'2; they are left empty, and we'll finish filling them later. After that, we connect T1 and T'1 so that the latter becomes the middle part inner tree of the former; the similar connection is done for T2 and T'2 as well. Finally, we update the position by deducting the size of the front finger from it, and go on traversing along the middle part inner tree.

When the first pass finishes, we are at a position where the splitting should be performed either in the front finger or in the rear finger. Splitting the nodes in a finger results in a tuple: the first and the third parts are the lists before and after the splitting point, while the second part is the node containing the element at the original position to be split. As each finger holds at most 3 nodes (they are actually 2-3 trees), the node-splitting algorithm can be performed by a linear search.

function Split-Nodes(L, i)
    for j ∈ [1, Length(L)] do
        if i < Size(L[j]) then
            return (L[1...j-1], L[j], L[j+1...Length(L)])
        i ← i - Size(L[j])

We next create the two new result trees T'1 and T'2 from this tuple, and connect them as the final middle part inner trees of T1 and T2.

Next, we need to perform a bottom-up traversal along the result trees to fill out all the empty information we skipped in the first pass. We loop on the second part of the tuple, the node, till it becomes a leaf. In each iteration, we split the children of the node with an updated position i. The first list of nodes returned from splitting is used to fill the rear finger of T1, and the other list of nodes is used to fill the front finger of T2. After that, since all the three parts of a finger tree - the front and rear fingers, and the middle part inner tree - are filled, we can calculate the size of the tree by summing these three parts up.

function Sum-Sizes(T)
    return Size-Nodes(Front(T)) + Size-Tr(Mid(T)) + Size-Nodes(Rear(T))

Next, the iteration goes up along the parent fields of T1 and T2. The last 'black-box' algorithm is From-Nodes(L), which creates a finger tree from a list of nodes. It can be easily realized by repeatedly performing insertion on an empty tree; the implementation is left as an exercise to the reader.

The example Python code for splitting is given below.

def splitAt(t, i):
    (t1, t2) = (Tree(), Tree())
    while szf(t) <= i and i < szf(t) + szm(t):
        fst = Tree(0, t.front, None, [])
        snd = Tree(0, [], None, t.rear)
        t1.set_mid(fst)
        t2.set_mid(snd)
        (t1, t2) = (fst, snd)
        i = i - szf(t)
        t = t.mid
    if i < szf(t):
        (xs, n, ys) = splitNs(t.front, i)
        sz = t.size - sizeNs(xs) - n.size
        (fst, snd) = (fromNodes(xs), Tree(sz, ys, t.mid, t.rear))
    elif szf(t) + szm(t) <= i:
        (xs, n, ys) = splitNs(t.rear, i - szf(t) - szm(t))
        sz = t.size - sizeNs(ys) - n.size
        (fst, snd) = (Tree(sz, t.front, t.mid, xs), fromNodes(ys))
    t1.set_mid(fst)
    t2.set_mid(snd)
    i = i - sizeT(fst)
    while not n.leaf:
        (xs, n, ys) = splitNs(n.children, i)
        i = i - sizeNs(xs)
        (t1.rear, t2.front) = (xs, ys)
        t1.size = sizeNs(t1.front) + sizeT(t1.mid) + sizeNs(t1.rear)
        t2.size = sizeNs(t2.front) + sizeT(t2.mid) + sizeNs(t2.rear)
        (t1, t2) = (t1.parent, t2.parent)
    return (flat(t1), elem(n), flat(t2))

The program to split a list of nodes at a given position is listed like this.

def splitNs(ns, i):
    for j in range(len(ns)):
        if i < ns[j].size:
            return (ns[:j], ns[j], ns[j+1:])
        i = i - ns[j].size

With splitting defined, removing an element at an arbitrary position can be realized trivially: first perform a split, then concatenate the two result trees into one big tree, and return the element at that position.

function Remove-At(T, i)
    (T1, x, T2) ← Split-At(T, i)
    return (x, Concat(T1, T2))
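Translated to Python on top of the splitAt() and concat() programs above, a sketch of this removal would be:

def removeAt(t, i):
    (t1, x, t2) = splitAt(t, i)
    return (x, concat(t1, t2))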
Exercise 12.6

1. Another way to realize insertT'() is to force the size field to increase by one, so that we needn't write function tree'(). Try to realize the algorithm by using this idea.

2. Try to handle the augmented size information, as is done in the insertT'() algorithm, for the following algorithms (both functional and imperative): extractT'(), appendT'(), removeT'(), and concat(). The head, tail, init and last functions should be kept unchanged. Don't refer to the downloadable programs along with this book before you take a try.

3. The imperative Apply-At() algorithm tests whether the size of the current tree is greater than one. Why don't we test whether the current tree is a leaf? Tell the difference between these two approaches.

4. Implement From-Nodes(L) in your favorite imperative programming language. You can either use looping or create a folding-from-right sub algorithm.

12.7 Notes and short summary

Although we haven't been able to give a purely functional realization matching the O(1) constant-time random access of arrays in imperative settings, the resulting finger tree data structure achieves an overall well-performing sequence. It manipulates fast, in amortized O(1) time, both on head and on tail; it can also concatenate two sequences in logarithmic time, as well as break one sequence into two sub-sequences at any position. Neither arrays in imperative settings nor linked lists in functional settings satisfy all these goals. Some functional programming languages adopt this sequence realization in their standard library [7].

Just as the title of this chapter says, we've presented the last corner stone of elementary data structures in both functional and imperative settings. We needn't be concerned about lacking elementary data structures when solving problems with typical algorithms. For example, when writing an MTF (move-to-front) encoding algorithm [8], with the help of the sequence data structure explained in this chapter, we can implement it quite straightforwardly:

mtf(S, i) = {x} ∪ S'

where (x, S') = removeAt(S, i).
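Assuming the removeAt and cons functions given earlier in this chapter, a one-line Haskell sketch of this MTF step would be:

-- move the i-th element to the front of the sequence
mtf t i = let (x, t') = removeAt t i in cons x t'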
In the following chapters, we'll first explain some typical divide and conquer sorting methods, including quick sort, merge sort and their variants; then some elementary searching algorithms and string matching algorithms will be covered.

Bibliography

[1] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press (July 1, 1999). ISBN-13: 978-0521663502
[2] Chris Okasaki. Purely Functional Random-Access Lists. Functional Programming Languages and Computer Architecture, June 1995, pages 86-95.
[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937.
[4] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's Guide. No Starch Press, 1st edition, April 2011, 400 pp. ISBN: 978-1-59327-283-8
[5] Ralf Hinze and Ross Paterson. Finger Trees: A Simple General-purpose Data Structure. Journal of Functional Programming 16:2 (2006), pages 197-217. http://www.soi.city.ac.uk/~ross/papers/FingerTree.html
[6] Guibas, L. J., McCreight, E. M., Plass, M. F., Roberts, J. R. (1977). A new representation for linear lists. Conference Record of the Ninth Annual ACM Symposium on Theory of Computing, pp. 49-60.
[7] Generic finger-tree structure. ...fingertree/0.0/doc/html/Data-FingerTree.html
[8] Wikipedia. Move-to-front transform. http://en.wikipedia.org/wiki/Move-to-front_transform
Part V

Sorting and Searching
Chapter 13

Divide and conquer, Quick sort vs. Merge sort

13.1 Introduction

It has been proved that comparison-based sorting cannot perform better than O(n lg n) [1]. In this chapter, two divide and conquer sorting algorithms are introduced, both of which perform in O(n lg n) time. One is quick sort, the most popular sorting algorithm. Quick sort has been well studied, and many programming libraries provide sorting tools based on it.

In this chapter, we'll first introduce the idea of quick sort, which demonstrates the power of the divide and conquer strategy well. Several variants will be explained, and we'll see when quick sort performs poorly in some special cases, namely when the algorithm is not able to partition the sequence in balance. In order to solve the unbalanced partition problem, we'll then introduce merge sort, which ensures the sequence is well partitioned in all cases. Some variants of merge sort, including natural merge sort and bottom-up merge sort, are shown as well. As in other chapters, all the algorithms are realized in both imperative and functional approaches.

13.2 Quick sort

Consider a teacher arranging a group of kids in a kindergarten to stand in a line for some game. The kids need to stand in order of their heights: the shortest one stands at the left-most end, while the tallest stands at the right-most end. How can the teacher instruct these kids so that they line up by themselves?

There are many strategies, and the quick sort approach can be applied here:

1. The first kid raises his/her hand. The kids who are shorter than him/her stand to the left of this child; the kids who are taller than him/her stand to the right of this child;
2. All the kids who moved to the left, if there are any, repeat the above step; all the kids who moved to the right repeat the same step as well.
[Figure 13.1: Instruct kids to stand in a line]

Suppose a group of kids have heights {102, 100, 98, 95, 96, 99, 101, 97} (in cm). The following table illustrates how they stand in order of height by following this method.

102 100  98  95  96  99 101  97
100  98  95  96  99 101  97 102
 98  95  96  99  97 100 101 102
 95  96  97  98  99 100 101 102
 95  96  97  98  99 100 101 102
 95  96  97  98  99 100 101 102
 95  96  97  98  99 100 101 102

At the beginning, the first child, with height 102 cm, raises his/her hand. We call this kid the pivot. It happens that this is the tallest kid, so all the others stand to the left side, which is represented in the second row of the table. Note that the child with height 102 cm is now in the final ordered position. Next the kid with height 100 cm raises a hand, so the children of heights 98, 95, 96 and 99 cm stand to his/her left, and there is only one child, of height 101 cm, who is taller than this pivot; he stands to the right. The third row in the table shows this stage accordingly. After that, the child of 98 cm is selected as the pivot on the left side, while the child of 101 cm is selected as the pivot on the right. Since there are no other kids in the unsorted group with 101 cm as pivot, this small group is already ordered, and the kid of height 101 cm is in the final proper position. The same method is applied to the groups of kids which are not yet in correct order, until all of them stand in their final positions.

13.2.1 Basic version

Summarizing the above instruction leads to the recursive description of quick sort. In order to sort a sequence of elements L:

- If L is empty, the result is obviously empty; this is the trivial edge case;
- Otherwise, select an arbitrary element of L as a pivot, recursively sort all elements not greater than the pivot and put the result on the left of the pivot, and recursively sort all elements greater than the pivot and put the result on the right of the pivot.
Note the emphasized word and: we don't use `then' here, which indicates that it's quite OK to run the recursive sorts on the left and right in parallel. We'll return to this parallelism topic soon.

Quick sort was first developed by C. A. R. Hoare in 1960 [1], [15]. What we describe here is a basic version. Note that it doesn't state how to select the pivot. We'll see soon that the pivot selection affects the performance of quick sort dramatically. The simplest method is to always choose the first element as the pivot, so that quick sort can be formalized as the following.

sort(L) =
    ∅ : L = ∅
    sort({x | x ∈ L′, x ≤ l₁}) ∪ {l₁} ∪ sort({x | x ∈ L′, l₁ < x}) : otherwise    (13.1)

Where l₁ is the first element of the non-empty list L, and L′ contains the rest elements {l₂, l₃, ...}. Note that we use Zermelo-Fraenkel expressions (ZF expressions for short), also known as list comprehensions. A ZF expression {a | a ∈ S, p₁(a), p₂(a), ...} means taking all elements of set S that satisfy all the predicates p₁, p₂, .... ZF expressions were originally used for representing sets; we extend them to express lists for the sake of brevity: there can be duplicated elements, and different permutations represent different lists. Please refer to the appendix about lists in this book for details.

It's quite straightforward to translate this equation to real code if list comprehension is supported. The following Haskell code is given as an example:

    sort [] = []
    sort (x:xs) = sort [y | y <- xs, y <= x] ++ [x] ++ sort [y | y <- xs, x < y]

This might be the shortest quick sort program in the world at the time this book was written. Even a verbose version is still very expressive:

    sort [] = []
    sort (x:xs) = as ++ [x] ++ bs where
        as = sort [a | a <- xs, a <= x]
        bs = sort [b | b <- xs, x < b]

There are some variants of this basic quick sort program, such as using explicit
filtering instead of list comprehension. The following Python program demonstrates this:

    def sort(xs):
        if xs == []:
            return []
        pivot = xs[0]
        smaller = sort(list(filter(lambda x : x <= pivot, xs[1:])))
        greater = sort(list(filter(lambda x : pivot < x, xs[1:])))
        return smaller + [pivot] + greater

13.2.2 Strict weak ordering

We have assumed so far that the elements are sorted in monotonic non-decreasing order. It's quite possible to customize the algorithm so that it sorts the elements by other ordering criteria. This is necessary in practice, because users may sort numbers, strings, or other complex objects (even lists of lists, for example). The typical generic solution is to abstract the comparison as a parameter, as we mentioned in the chapters about insertion sort and selection sort. Although a total ordering isn't needed, the comparison must satisfy strict weak ordering at least [17] [16]. For the sake of brevity, we only consider sorting elements by using `less than or equal to' (equivalent to `not greater than') in the rest of the chapter.
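As a concrete sketch of this customization (the name sortBy' and its signature are ours, not from the book), the basic quick sort can be parameterized over the ordering relation:

    -- le is any relation satisfying strict weak ordering,
    -- used here in the sense of `not greater than'.
    sortBy' :: (a -> a -> Bool) -> [a] -> [a]
    sortBy' _  []     = []
    sortBy' le (x:xs) = sortBy' le [a | a <- xs, a `le` x]
                        ++ [x] ++
                        sortBy' le [b | b <- xs, not (b `le` x)]

Here sortBy' (<=) sorts in ascending order, while sortBy' (>=) sorts in descending order.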
13.2.3 Partition

Observe that the basic version actually takes two passes: one to find all the elements greater than the pivot, and another to find those which are not. Such a partition can be accomplished in only one pass. We explicitly define the partition as below.

partition(p, L) =
    (∅, ∅) : L = ∅
    ({l₁} ∪ A, B) : p(l₁), (A, B) = partition(p, L′)
    (A, {l₁} ∪ B) : ¬p(l₁)    (13.2)

Note that the operation {x} ∪ L is just a `cons' operation, which takes only constant time. The quick sort can be modified accordingly.

sort(L) =
    ∅ : L = ∅
    sort(A) ∪ {l₁} ∪ sort(B) : otherwise, (A, B) = partition(λx · x ≤ l₁, L′)    (13.3)

Translating this new algorithm into Haskell yields the code below.

    sort [] = []
    sort (x:xs) = sort as ++ [x] ++ sort bs where
        (as, bs) = partition (<= x) xs

    partition _ [] = ([], [])
    partition p (x:xs) = let (as, bs) = partition p xs in
        if p x then (x:as, bs) else (as, x:bs)

The concept of partition is very critical to quick sort; it is also very important to many other sorting algorithms. We'll explain how it generally affects the sorting methodology at the end of this chapter. Before further discussion about fine-tuning of quick sort's specific partition, let's see how to realize it in-place imperatively.

There are many partition methods. The one given by Nico Lomuto [4] [2] will be used here, as it's easy to understand. We'll show other partition algorithms soon and see how partitioning affects the performance.

Figure 13.2 shows the idea of this one-pass partition method. The array is processed from left to right. At any time, the array consists of the following parts, as shown in figure 13.2 (a):

- The left-most cell contains the pivot; by the end of the partition process, the pivot will be moved to its final proper position;
- A segment contains all elements which are not greater than the pivot. The right boundary of this segment is marked as `left';
- A segment contains all elements which are greater than the pivot. The right boundary of this segment is marked as `right'. It means that elements between the `left' and `right' marks are greater than the pivot;
- The rest of the elements after the `right' mark haven't been processed yet. They may be greater than the pivot or not.

[Figure 13.2: Partition a range of the array by using the left-most element as the pivot. (a) Partition invariant; (b) Start; (c) Finish]

At the beginning of the partition, the `left' mark points to the pivot and the `right' mark points to the second element of the array, as in figure 13.2 (b). Then the algorithm repeatedly advances the `right' mark one element after the other until it passes the end of the array.

In every iteration, the element pointed to by the `right' mark is compared with the pivot. If it is greater than the pivot, it should be in the segment between the `left' and `right' marks, so the algorithm just advances the `right' mark and examines the next element. Otherwise, since the element pointed to by the `right' mark is less than or equal to the pivot (not greater than it), it should be put before the `left' mark. In order to achieve this, the `left' mark is advanced by one, then the elements pointed to by the `left' and `right' marks are exchanged.

Once the `right' mark passes the last element, all the elements have been processed. The elements which are greater than the pivot have been moved to the right of the `left' mark, while the others are to the left of this mark. Note that the pivot should move between the two segments. An extra exchange between the pivot and the element pointed to by the `left' mark puts this final one in the correct location. This is shown by the bi-directional swap arrow in figure 13.2 (c). The `left' mark (which finally points to the pivot) partitions the whole array into two parts and is returned as the result. We typically increase the `left' mark by one, so that it points to the first element greater than the pivot, for convenience. Note that the array is modified in-place.

The partition algorithm can be described as the following. It takes three arguments: the array A, and the lower and upper bounds to be partitioned¹.

1: function Partition(A, l, u)
2:     p ← A[l]    ▷ the pivot
3:     L ← l    ▷ the left mark
4:     for R ∈ [l + 1, u] do    ▷ iterate on the right mark
5:         if ¬(p ≤ A[R]) then    ▷ negating ≤ is enough for strict weak order
6:             L ← L + 1
7:             Exchange A[L] ↔ A[R]
8:     Exchange A[L] ↔ p
9:     return L + 1    ▷ the partition position

The table below shows the steps of partitioning the array {3, 2, 5, 4, 0, 1, 6, 7}.

(l)3 (r)2  5  4  0  1  6  7    initialize, pivot = 3, l = 1, r = 2
3  (l)(r)2  5  4  0  1  6  7   2 ≤ 3, advance l (r = l)
3  (l)2  (r)5  4  0  1  6  7   5 > 3, move on
3  (l)2  5  (r)4  0  1  6  7   4 > 3, move on
3  (l)2  5  4  (r)0  1  6  7   0 ≤ 3
3  2  (l)0  4  (r)5  1  6  7   advance l, then swap with r
3  2  (l)0  4  5  (r)1  6  7   1 ≤ 3
3  2  0  (l)1  5  (r)4  6  7   advance l, then swap with r
3  2  0  (l)1  5  4  (r)6  7   6 > 3, move on
3  2  0  (l)1  5  4  6  (r)7   7 > 3, move on
1  2  0  3 (l+1)5  4  6  7     r passes the end; swap pivot and l

This version of the partition algorithm can be implemented in ANSI C as the following.

    int partition(Key* xs, int l, int u) {
        int pivot, r;
        for (pivot = l, r = l + 1; r < u; ++r)
            if (!(xs[pivot] <= xs[r])) {
                ++l;
                swap(xs[l], xs[r]);
            }
        swap(xs[pivot], xs[l]);
        return l + 1;
    }

Where swap(a, b) can either be defined as a function or a macro. In ISO C++, swap(a, b) is provided as a function template. The type of the elements can be defined somewhere or abstracted as a template parameter in ISO C++. We omit these language-specific details here.

With the in-place partition realized, the imperative in-place quick sort can be accomplished by using it.

¹The partition algorithm used here is slightly different from the one in [2]. The latter uses the last element in the slice as the pivot.
1: procedure Quick-Sort(A, l, u)
2:     if l < u then
3:         m ← Partition(A, l, u)
4:         Quick-Sort(A, l, m - 1)
5:         Quick-Sort(A, m, u)

When sorting an array, this procedure is called by passing the whole range as the lower and upper bounds: Quick-Sort(A, 1, |A|). Note that when l ≥ u, the array slice is either empty or contains only one element; both can be treated as ordered, so the algorithm does nothing in such cases. The ANSI C example program below completes the basic in-place quick sort.

    void quicksort(Key* xs, int l, int u) {
        int m;
        if (l < u) {
            m = partition(xs, l, u);
            quicksort(xs, l, m - 1);
            quicksort(xs, m, u);
        }
    }

13.2.4 Minor improvement in functional partition

Before exploring how to improve the partition for the basic version of quick sort, it's obvious that the partition presented so far can be defined by using folding. Please refer to appendix A of this book for the definition of folding.

partition(p, L) = fold(f(p), (∅, ∅), L)    (13.4)

Where function f compares the element to the pivot with predicate p (which is passed to f as a parameter, so that f is in curried form; see appendix A for detail. Alternatively, f can be a lexical closure in the scope of partition, so that it can access the predicate in that scope), and updates the result pair accordingly.

f(p, x, (A, B)) =
    ({x} ∪ A, B) : p(x)
    (A, {x} ∪ B) : otherwise (¬p(x))    (13.5)

Note that we actually use a pattern-matching style definition. In an environment without pattern-matching support, the pair (A, B) should be represented by a variable, for example P, with access functions to extract its first and second parts. The example Haskell program needs to be modified accordingly.

    sort [] = []
    sort (x:xs) = sort small ++ [x] ++ sort big where
        (small, big) = foldr f ([], []) xs
        f a (as, bs) = if a <= x then (a:as, bs) else (as, a:bs)

Accumulated partition

The partition algorithm using folding actually accumulates to the result pair of lists (A, B): if the element is not greater than the pivot, it's accumulated
to A, otherwise to B. We can express this explicitly, which saves space and is friendly for tail-recursive call optimization (refer to appendix A of this book for detail).

partition(p, L, A, B) =
    (A, B) : L = ∅
    partition(p, L′, {l₁} ∪ A, B) : p(l₁)
    partition(p, L′, A, {l₁} ∪ B) : otherwise    (13.6)

Where l₁ is the first element of L if L isn't empty, and L′ contains the rest elements except for l₁, that is L′ = {l₂, l₃, ...}. The quick sort algorithm then uses this accumulated partition function by passing λx · x ≤ l₁ as the partition predicate.

sort(L) =
    ∅ : L = ∅
    sort(A) ∪ {l₁} ∪ sort(B) : otherwise    (13.7)

Where A and B are computed by the accumulated partition function defined above.

(A, B) = partition(λx · x ≤ l₁, L′, ∅, ∅)

Accumulated quick sort

Observe the recursive case in the last quick sort definition: the list concatenation operations sort(A) ∪ {l₁} ∪ sort(B) are actually proportional to the length of the lists being concatenated. Of course we can use some general solutions introduced in appendix A of this book to improve it. Another way is to change the sort algorithm to an accumulated manner, something like below:

sort′(L, S) =
    S : L = ∅
    ... : otherwise

Where S is the accumulator, and we call this version by passing the empty list as the accumulator to start sorting: sort(L) = sort′(L, ∅). The key intuition is that after the partition finishes, the two sub-lists need to be recursively sorted. We can first recursively sort the list containing the elements greater than the pivot, then link the pivot in front of it, and use the result as the accumulator for the next sorting step. Based on this idea, the `...' part in the above definition can be realized as the following.

sort′(L, S) =
    S : L = ∅
    sort′(A, {l₁} ∪ sort′(B, ?)) : otherwise

The problem is what the accumulator should be when sorting B. There is an important invariant: at every moment, the accumulator S holds the elements sorted so far. So we should sort B by accumulating to S.

sort′(L, S) =
    S : L = ∅
    sort′(A, {l₁} ∪ sort′(B, S)) : otherwise    (13.8)

The following Haskell example program implements the accumulated quick sort algorithm.
    asort xs = asort' xs []

    asort' [] acc = acc
    asort' (x:xs) acc = asort' as (x : asort' bs acc) where
        (as, bs) = part xs [] []
        part [] as bs = (as, bs)
        part (y:ys) as bs | y <= x = part ys (y:as) bs
                          | otherwise = part ys as (y:bs)

Exercise 13.1

- Implement the recursive basic quick sort algorithm in your favorite imperative programming language.
- As in the imperative algorithm, one minor improvement is that besides the empty case, we needn't sort a singleton list. Implement this idea in the functional algorithm as well.
- The accumulated quick sort algorithm developed in this section uses intermediate variables A and B. They can be eliminated by defining the partition function to mutually recursively call the sort function. Implement this idea in your favorite functional programming language. Please don't refer to the downloadable example program along with this book before you try it.

13.3 Performance analysis for quick sort

Quick sort performs well in practice; however, it's not easy to give a theoretical analysis. It needs the tool of probability to prove the average case performance. Nevertheless, it's intuitive to calculate the best case and the worst case performance.

It's obvious that the best case happens when every partition divides the sequence into two slices of equal size. Thus it takes O(lg n) recursive calls, as shown in figure 13.3. There are in total O(lg n) levels of recursion. In the first level, it executes one partition, which processes n elements; in the second level, it executes the partition two times, each processing n/2 elements, so the total time in the second level is bounded by 2O(n/2) = O(n) as well. In the third level, it executes the partition four times, each processing n/4 elements; the total time in the third level is also bounded by O(n); ... In the last level, there are n small slices, each containing a single element, so the time is bounded by O(n). Summing the time over all levels gives the total performance of quick sort in the best case as O(n lg n).

However, in the worst case, the partition process unluckily divides the sequence into two slices with unbalanced lengths most of the time: one slice with length O(1), the other with O(n). Thus the recursion depth degrades to O(n). If we draw a similar figure: unlike the best case, which forms a balanced binary tree, the worst case degrades into a very unbalanced tree in which every node has only one child, while the other is empty. The binary tree turns into a linked list of O(n) length. And in every level, all the elements are processed, so the total performance in the worst case is O(n²), which is as poor as insertion sort and selection sort.
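The degeneration is easy to observe with a small sketch of ours (not from the book's source): compute the depth of the partition tree built by the basic quick sort, which always picks the first element as the pivot.

    -- Depth of the partition tree of the basic quick sort.
    depth :: Ord a => [a] -> Int
    depth [] = 0
    depth (x:xs) = 1 + max (depth [a | a <- xs, a <= x])
                           (depth [b | b <- xs, x < b])

For a shuffled list of 100 elements the depth is typically a small multiple of lg 100, while depth [1..100] is exactly 100: every node of the partition tree has one empty branch, and the total work degrades to O(n²).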
[Figure 13.3: In the best case, quick sort divides the sequence into two slices of the same length.]

Let's consider when the worst case will happen. One special case is that all the elements (or most of the elements) are the same. Nico Lomuto's partition method deals with such sequences poorly. We'll see how to solve this problem by introducing other partition algorithms in the next section.

The other two obvious cases which lead to the worst case happen when the sequence is already in ascending or descending order. Partitioning the ascending sequence makes an empty sub-list before the pivot, while the list after the pivot contains all the rest of the elements. Partitioning the descending sequence gives the opposite result.

There are other cases which make quick sort perform poorly. There is no completely satisfactory solution which can avoid the worst case. We'll see some engineering practices in the next section which make it very seldom to meet the worst case.

13.3.1 Average case analysis ★

In the average case, quick sort performs well. There is a vivid example: even if every partition divides the list into two lists with lengths in ratio 1 to 9, the performance is still bound to O(n lg n), as shown in [2]. This subsection needs some mathematical background; the reader can safely skip to the next part.

There are two methods to prove the average case performance. One uses the important fact that the performance is proportional to the total number of comparison operations during quick sort [2]. Unlike selection sort, where every two elements are compared, quick sort avoids many unnecessary comparisons. For example, suppose a partition operation on list {a₁, a₂, a₃, ..., aₙ} selects a₁ as the pivot. The partition builds two sub-lists A = {x₁, x₂, ..., xₖ} and B = {y₁, y₂, ..., y_{n-k-1}}. In the rest of the quick sort run, the elements in A will never be compared with any elements in B.

Denote the final sorted result as {a₁ ≤ a₂ ≤ ... ≤ aₙ}. This indicates that if elements aᵢ < aⱼ, they will not be compared any longer if and only if some element aₖ with aᵢ < aₖ < aⱼ has been selected as a pivot before aᵢ or aⱼ is selected as a pivot. That is to say, the only chance that aᵢ and aⱼ are compared is that either aᵢ or aⱼ is chosen as a pivot before any other element in the ordered range aᵢ₊₁ ≤ aᵢ₊₂ ≤ ... ≤ aⱼ₋₁ is selected. Let P(i, j) represent the probability that aᵢ and aⱼ are compared. We have:

P(i, j) = 2/(j - i + 1)    (13.9)

(For adjacent elements, j = i + 1, this probability is 1: neighboring elements are always compared; for the extreme pair i = 1, j = n, it is only 2/n.) The total number of comparison operations can be given as:

C(n) = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} P(i, j)    (13.10)

Note the fact that if we have compared aᵢ and aⱼ, we won't compare aⱼ and aᵢ again in the quick sort algorithm, and we never compare aᵢ with itself. That's why the upper bound of i is n - 1 and the lower bound of j is i + 1. Substituting the probability yields:

C(n) = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 2/(j - i + 1)
     = Σ_{i=1}^{n-1} Σ_{k=1}^{n-i} 2/(k + 1)    (13.11)

Using the harmonic series [18]:

Hₙ = 1 + 1/2 + 1/3 + ... = ln n + γ + εₙ

C(n) = Σ_{i=1}^{n-1} O(lg n) = O(n lg n)    (13.12)

The other method to prove the average performance uses the recursive fact that, when sorting a list of length n, the partition splits the list into two sub-lists of lengths i and n - i - 1. The partition process itself takes cn time, because it examines every element against the pivot. So we have the following equation.

T(n) = T(i) + T(n - i - 1) + cn    (13.13)

Where T(n) is the total time to quick sort a list of length n. Since i is equally likely to be any of 0, 1, ..., n - 1, taking the mathematical expectation of the equation gives:

T(n) = E(T(i)) + E(T(n - i - 1)) + cn
     = (1/n) Σ_{i=0}^{n-1} T(i) + (1/n) Σ_{i=0}^{n-1} T(n - i - 1) + cn
     = (1/n) Σ_{i=0}^{n-1} T(i) + (1/n) Σ_{j=0}^{n-1} T(j) + cn
     = (2/n) Σ_{i=0}^{n-1} T(i) + cn    (13.14)

Multiplying both sides by n, the equation changes to:

nT(n) = 2 Σ_{i=0}^{n-1} T(i) + cn²    (13.15)

Substituting n - 1 for n gives another equation:

(n - 1)T(n - 1) = 2 Σ_{i=0}^{n-2} T(i) + c(n - 1)²    (13.16)

Subtracting (13.16) from (13.15) eliminates all the T(i) for 0 ≤ i < n - 1:

nT(n) = (n + 1)T(n - 1) + 2cn - c    (13.17)

As we can drop the constant time c in computing performance, the equation can be transformed one more step as below.

T(n)/(n + 1) = T(n - 1)/n + 2c/(n + 1)    (13.18)

Next we substitute n - 1, n - 2, ... for n, which gives us n - 1 equations.

T(n - 1)/n = T(n - 2)/(n - 1) + 2c/n
T(n - 2)/(n - 1) = T(n - 3)/(n - 2) + 2c/(n - 1)
...
T(2)/3 = T(1)/2 + 2c/3

Summing them all up and eliminating the common terms on both sides, we can deduce a function of n.

T(n)/(n + 1) = T(1)/2 + 2c Σ_{k=3}^{n+1} 1/k    (13.19)

Using the harmonic series mentioned above, the final result is:

O(T(n)/(n + 1)) = O(T(1)/2 + 2c(ln n + γ + εₙ)) = O(lg n)    (13.20)

Thus

O(T(n)) = O(n lg n)    (13.21)

Exercise 13.2

- Why does Lomuto's method perform poorly when there are many duplicated elements?

13.4 Engineering Improvement

Quick sort performs well in most cases, as mentioned in the previous section. However, there do exist worst cases which downgrade the performance to quadratic. If the data is randomly prepared, such cases are rare; however, some particular sequences lead to the worst case, and these kinds of sequences are very common in practice. In this section, some engineering practices are introduced, which either help to avoid poor performance when handling special input data with an improved partition algorithm, or try to uniform the possibilities among cases.

13.4.1 Engineering solution to duplicated elements

As presented in the exercise of the above section, Nico Lomuto's partition method isn't good at handling sequences with many duplicated elements. Consider a sequence of n equal elements: {x, x, ..., x}. There are actually two methods to sort it.

1. The normal basic quick sort: we select an arbitrary element, which is x, as the pivot, and partition the sequence into two sub-sequences. One is {x, x, ..., x}, which contains n - 1 elements; the other is empty. Then we recursively sort the
first one; this is obviously a quadratic O(n²) solution.
2. The other way is to pick only those elements strictly smaller than x and strictly greater than x. Such a partition results in two empty sub-sequences and n elements equal to the pivot. Next we recursively sort the sub-sequences containing the smaller and the bigger elements; since both of them are empty, the recursive calls return immediately. The only thing left is to concatenate the sort results in front of and after the list of elements which are equal to the pivot.

The latter performs in O(n) time if all elements are equal. This indicates an important improvement for partition: instead of binary partition (split to two sub-lists and a pivot), ternary partition (split to three sub-lists) handles duplicated elements better.

We can define the ternary quick sort as the following.

sort(L) =
    ∅ : L = ∅
    sort(S) ∪ sort(E) ∪ sort(G) : otherwise    (13.22)

Where S, E, G are sub-lists containing all elements which are less than, equal to, and greater than the pivot respectively.

S = {x | x ∈ L, x < l₁}
E = {x | x ∈ L, x = l₁}
G = {x | x ∈ L, l₁ < x}

The basic ternary quick sort can be implemented in Haskell as the following example code.

    sort [] = []
    sort (x:xs) = sort [a | a <- xs, a < x] ++ x : [b | b <- xs, b == x] ++ sort [c | c <- xs, c > x]

Note that the comparison between elements must support abstract `less-than' and `equal-to' operations. The basic version of ternary sort takes linear O(n) time to concatenate the three sub-lists. It can be improved by using the standard accumulator technique.

Suppose function sort′(L, A) is the accumulated ternary quick sort definition, where L is the sequence to be sorted and the accumulator A contains the intermediate sorted result so far. We initialize the sorting with an empty accumulator: sort(L) = sort′(L, ∅). It's easy to give the trivial edge case like below.

sort′(L, A) =
    A : L = ∅
    ... : otherwise

For the recursive case, as the ternary partition splits into three sub-lists S, E, G, only S and G need recursive sorting; E contains all elements equal to the pivot, which is in correct order and thus needn't be sorted any more. The idea is to sort G with accumulator A, concatenate it behind E, then use this result as the new accumulator and start to sort S:

sort′(L, A) =
    A : L = ∅
    sort′(S, E ∪ sort′(G, A)) : otherwise    (13.23)

The partition can also be realized with accumulators. It is similar to what has been developed for the basic version of quick sort. Note that we can't just pass one predicate for pivot comparison; it actually needs two, one for less-than and the other for equality testing. For the sake of brevity, we pass the pivot element instead.

partition(p, L, S, E, G) =
    (S, E, G) : L = ∅
    partition(p, L′, {l₁} ∪ S, E, G) : l₁ < p
    partition(p, L′, S, {l₁} ∪ E, G) : l₁ = p
    partition(p, L′, S, E, {l₁} ∪ G) : p < l₁    (13.24)
Where l₁ is the first element of L if L isn't empty, and L′ contains all the rest elements except for l₁. The Haskell program below implements this algorithm. It starts the recursive sorting immediately in the edge case of partition.

    sort xs = sort' xs []

    sort' [] r = r
    sort' (x:xs) r = part xs [] [x] [] r where
        part [] as bs cs r = sort' as (bs ++ sort' cs r)
        part (x':xs') as bs cs r | x' < x = part xs' (x':as) bs cs r
                                 | x' == x = part xs' as (x':bs) cs r
                                 | x' > x = part xs' as bs (x':cs) r

Richard Bird developed another version in [1]: instead of concatenating the recursively sorted results, it keeps a list of sorted sub-lists and performs the concatenation finally.

    sort xs = concat $ pass xs []

    pass [] xss = xss
    pass (x:xs) xss = step xs [] [x] [] xss where
        step [] as bs cs xss = pass as (bs : pass cs xss)
        step (x':xs') as bs cs xss | x' < x = step xs' (x':as) bs cs xss
                                   | x' == x = step xs' as (x':bs) cs xss
                                   | x' > x = step xs' as bs (x':cs) xss

2-way partition

The cases with many duplicated elements can also be handled imperatively. Robert Sedgewick presented a partition method [3], [4] which holds two pointers: one moves from left to right, the other moves from right to left. The two pointers are initialized as the left and right boundaries of the array.

When the partition starts, the left-most element is selected as the pivot. Then the left pointer i keeps advancing to the right until it meets an element which is not less than the pivot; on the other hand², the right pointer j repeatedly scans to the left until it meets an element which is not greater than the pivot.

At this time, all elements before the left pointer i are strictly less than the pivot, while all elements after the right pointer j are greater than the pivot; i points to an element which is greater than or equal to the pivot, while j points to an element which is less than or equal to the pivot. The situation at this stage is illustrated in figure 13.4 (a).

In order to partition all elements less than or equal to the pivot to the left, and the others to the right, we can exchange the two elements pointed to by i and j. After that the scan can be resumed, until either i meets j or they overlap.

At any point during the partition, there is an invariant that all elements before i (including the one pointed to by i) are not greater than the pivot, while all elements after j (including the one pointed to by j) are not less than the pivot. The elements between i and j haven't been examined yet. This invariant is shown in figure 13.4 (b).

After the left pointer i meets the right pointer j, or they overlap each other, we need one extra exchange to move the pivot, located at the first position, to

²We don't use `then' because it's quite OK to perform the two scans in parallel.
the correct place, which is pointed to by j. Next, the elements between the lower bound and j, as well as the sub-slice between i and the upper bound of the array, are recursively sorted. This algorithm can be described as the following.

[Figure 13.4: Partition a range of the array by using the left-most element as the pivot. (a) When pointers i and j stop; (b) Partition invariant]

1: procedure Sort(A, l, u)    ▷ sort range [l, u)
2:     if u - l > 1 then    ▷ more than 1 element for the non-trivial case
3:         i ← l, j ← u
4:         pivot ← A[l]
5:         loop
6:             repeat
7:                 i ← i + 1
8:             until A[i] ≥ pivot    ▷ need to handle the error case i ≥ u in fact
9:             repeat
10:                j ← j - 1
11:            until A[j] ≤ pivot    ▷ need to handle the error case j < l in fact
12:            if j < i then
13:                break
14:            Exchange A[i] ↔ A[j]
15:        Exchange A[l] ↔ A[j]    ▷ move the pivot
16:        Sort(A, l, j)
17:        Sort(A, i, u)

Consider the extreme case that all elements are equal: this in-place quick sort partitions the list into two sub-lists of equal length, although it takes n/2 unnecessary swaps. As the partition is balanced, the overall performance is O(n lg n), which avoids downgrading to quadratic. The following ANSI C example program implements this algorithm.

    void qsort(Key* xs, int l, int u) {
        int i, j, pivot;
        if (l < u - 1) {
            pivot = i = l; j = u;
            while (1) {
                while (i < u && xs[++i] < xs[pivot]);
                while (l < j && xs[pivot] < xs[--j]);
                if (j < i) break;
                swap(xs[i], xs[j]);
            }
            swap(xs[pivot], xs[j]);
            qsort(xs, l, j);
            qsort(xs, i, u);
        }
    }

Comparing this algorithm with the basic version based on Nico Lomuto's partition method, we can find that it swaps fewer elements, because it skips those already on the proper side of the pivot.

3-way partition

Obviously, we should avoid those unnecessary swaps for the duplicated elements. Furthermore, the algorithm can be developed with the idea of ternary sort (also known as 3-way partition in some materials): all the elements which are strictly less than the pivot are put in the left sub-slice, while those greater than the pivot are put in the right one. The middle part holds all the elements which are equal to the pivot. With such a ternary partition, we only need to recursively sort the ones which differ from the pivot. Thus in the above extreme case, there aren't any elements needing further sorting, so the overall performance is linear O(n).

The difficulty is how to do the 3-way partition. Jon Bentley and Douglas McIlroy developed a solution which keeps the elements equal to the pivot at the left-most and right-most sides, as shown in figure 13.5 (a) [5] [6].

[Figure 13.5: 3-way partition. (a) Invariant of 3-way partition; (b) Swapping the equal parts to the middle]

The major part of the scan process is the same as the one developed by Robert Sedgewick: i and j keep advancing toward each other until they meet an element which is greater than or equal to the pivot for i, or less than or equal to the pivot for j, respectively. At this time, if i and j don't meet each other or overlap, the elements are not only exchanged, but also examined to see whether they are identical to the pivot. If so, the necessary exchange happens between i and p, as well as between j and q.

By the end of the partition process, the elements equal to the pivot need to be swapped to the middle part from the left and right ends. The number of such extra exchange operations is proportional to the number of duplicated elements. It's zero operations if all elements are unique, so there is no overhead in that case. The final partition result is shown in figure 13.5 (b). After that, we only need to recursively sort the `less-than' and `greater-than' sub-slices.

This algorithm can be given by modifying the 2-way partition as below.

1: procedure Sort(A, l, u)
2:     if u - l > 1 then
3:         i ← l, j ← u
4:         p ← l, q ← u    ▷ point to the boundaries for equal elements
5:         pivot ← A[l]
6:         loop
7:             repeat
8:                 i ← i + 1
9:             until A[i] ≥ pivot    ▷ skip the error handling for i ≥ u
10:            repeat
11:                j ← j - 1
12:            until A[j] ≤ pivot    ▷ skip the error handling for j < l
13:            if j ≤ i then
14:                break    ▷ note the difference from the above algorithm
15:            Exchange A[i] ↔ A[j]
16:            if A[i] = pivot then    ▷ handle the equal elements
17:                p ← p + 1
18:                Exchange A[p] ↔ A[i]
19:            if A[j] = pivot then
20:                q ← q - 1
21:                Exchange A[q] ↔ A[j]
22:        if i = j ∧ A[i] = pivot then    ▷ a special case
23:            j ← j - 1, i ← i + 1
24:        for k from l to p do    ▷ swap the equal elements to the middle part
25:            Exchange A[k] ↔ A[j]
26:            j ← j - 1
27:        for k from u - 1 down-to q do
28:            Exchange A[k] ↔ A[i]
29:            i ← i + 1
30:        Sort(A, l, j + 1)
31:        Sort(A, i, u)

This algorithm can be translated to the following ANSI C example program.

    void qsort2(Key* xs, int l, int u) {
        int i, j, k, p, q;
        Key pivot;
        if (l < u - 1) {
            i = p = l; j = q = u; pivot = xs[l];
            while (1) {
                while (i < u && xs[++i] < pivot);
                while (l < j && pivot < xs[--j]);
                if (j <= i) break;
                swap(xs[i], xs[j]);
                if (xs[i] == pivot) { ++p; swap(xs[p], xs[i]); }
                if (xs[j] == pivot) { --q; swap(xs[q], xs[j]); }
            }
            if (i == j && xs[i] == pivot) { --j, ++i; }
            for (k = l; k < p; ++k, --j) swap(xs[k], xs[j]);
            for (k = u - 1; k > q; --k, ++i) swap(xs[k], xs[i]);
            qsort2(xs, l, j + 1);
            qsort2(xs, i, u);
        }
    }

It can be seen that the algorithm turns out to be a bit complex when it evolves to 3-way partition, and there are some tricky edge cases which should be handled with caution. Actually, we just need a ternary partition algorithm. This reminds us of Nico Lomuto's method, which is straightforward enough to be a starting point. The idea is to change the invariant a bit. We still select the
first element as the pivot. As shown in figure 13.6, at any time, the left-most section contains elements which are strictly less than the pivot; the next section contains the elements equal to the pivot; the right-most section holds all the elements which are strictly greater than the pivot. The boundaries of the three sections are marked as i, k, and j respectively. The rest part, between k and j, holds the elements which haven't been scanned yet.

At the beginning of this algorithm, the `less-than' section is empty, and the `equal-to' section contains only one element, which is the pivot; so i is initialized to the lower bound of the array, and k points to the element next to i. The `greater-than' section is also initialized as empty, thus j is set to the upper bound.

[Figure 13.6: 3-way partition based on Nico Lomuto's method.]

When the partition process starts, the element pointed to by k is examined. If it's equal to the pivot, k just advances to the next one. If it's greater than the pivot, we swap it with the last element in the unknown area, so that the length of the `greater-than' section increases by one and its boundary j moves to the left. Since we don't know whether the element swapped to k is still greater than the pivot, it should be examined again. Otherwise, if the element is less than the pivot, we can exchange it with the first one in the `equal-to' section to restore the invariant. The partition algorithm stops when k meets j.

1: procedure Sort(A, l, u)
2:     if u - l > 1 then
3:         i ← l, j ← u, k ← l + 1
4:         pivot ← A[i]
5:         while k < j do
6:             while pivot < A[k] do
7:                 j ← j - 1
8:                 Exchange A[k] ↔ A[j]
9:             if A[k] < pivot then
10:                Exchange A[k] ↔ A[i]
11:                i ← i + 1
12:            k ← k + 1
13:        Sort(A, l, i)
14:        Sort(A, j, u)

Compared with the previous 3-way partition quick sort algorithm, this one is simpler, at the cost of more swap operations. The ANSI C program below implements this algorithm.

    void qsort(Key* xs, int l, int u) {
        int i, j, k;
        Key pivot;
        if (l < u - 1) {
            i = l; j = u; pivot = xs[l];
            for (k = l + 1; k < j; ++k) {
                while (pivot < xs[k]) { --j; swap(xs[j], xs[k]); }
                if (xs[k] < pivot) { swap(xs[i], xs[k]); ++i; }
            }
            qsort(xs, l, i);
            qsort(xs, j, u);
        }
    }

Exercise 13.3

- All the imperative quick sort algorithms given so far use the
first element as the pivot; another method is to choose the last one as the pivot. Realize the quick sort algorithms, including the basic version, the Sedgewick version, and the ternary (3-way partition) version, by using this approach.

13.5 Engineering solution to the worst case

Although the ternary quick sort (3-way partition) solves the issue of duplicated elements, it can't handle some typical worst cases. For example, if many of the elements in the sequence are ordered, no matter whether in ascending or descending order, the partition result will be two unbalanced sub-sequences: one with few elements, the other containing all the rest.

Consider the two extreme cases, {x₁ < x₂ < ... < xₙ} and {y₁ > y₂ > ... > yₙ}. The partition results are shown in figure 13.7.

It's easy to give some more worst cases. For example, {xₘ, xₘ₋₁, ..., x₂, x₁, xₘ₊₁, xₘ₊₂, ..., xₙ} where {x₁ < x₂ < ... < xₙ}; another one is {xₙ, x₁, xₙ₋₁, x₂, ...}. Their partition result trees are shown in figure 13.8.

Observing that the bad partition happens easily when we blindly choose the first element as the pivot, there is a popular workaround suggested by Robert Sedgewick in [3]. Instead of selecting a fixed position in the sequence, a small sampling helps to find a pivot which has a lower possibility of causing a bad partition. One option is to examine the first element, the middle, and the last one,
[Figure 13.7: The two worst cases. (a) The partition tree for {x₁ < x₂ < ... < xₙ}: there aren't any elements less than or equal to the pivot (the first element) in any partition. (b) The partition tree for {y₁ > y₂ > ... > yₙ}: there aren't any elements greater than or equal to the pivot (the first element) in any partition.]

[Figure 13.8: Another two worst cases. (a) Except for the first partition, all the others are unbalanced. (b) A zig-zag partition tree.]
then choose the median of these three elements. In the worst case, this can ensure that there is at least one element in the shorter partitioned sub-list.

Note that there is one trick in real-world implementation. Since the index is typically represented by a word of limited length, it may overflow when calculating the middle index with the naive expression (l + u) / 2. In order to avoid this issue, the middle index can be computed as l + (u - l) / 2.

There are two methods to find the median: one needs at most three comparisons [5]; the other is to move the minimum value to the first location, the maximum value to the last location, and the median value to the middle location by swapping. After that, we can select the middle element as the pivot. The algorithm below illustrates the second idea before calling the partition procedure.

1: procedure Sort(A, l, u)
2:     if u - l > 1 then
3:         m ← ⌊(l + u)/2⌋    ▷ need to handle the overflow error in practice
4:         if A[m] < A[l] then    ▷ ensure A[l] ≤ A[m]
5:             Exchange A[l] ↔ A[m]
6:         if A[u - 1] < A[l] then    ▷ ensure A[l] ≤ A[u - 1]
7:             Exchange A[l] ↔ A[u - 1]
8:         if A[u - 1] < A[m] then    ▷ ensure A[m] ≤ A[u - 1]
9:             Exchange A[m] ↔ A[u - 1]
10:        Exchange A[l] ↔ A[m]
11:        (i, j) ← Partition(A, l, u)
12:        Sort(A, l, i)
13:        Sort(A, j, u)

Obviously, this algorithm performs well in the four special worst cases given above. The imperative implementation of median-of-three is left as an exercise to the reader.

However, in purely functional settings, it's expensive to randomly access the middle and the last elements. We can't directly translate the imperative median selection algorithm. The idea of taking a small sample and then finding its median as the pivot can be realized alternatively by taking the first three elements, as in the following Haskell program.

    qsort [] = []
    qsort [x] = [x]
    qsort [x, y] = [min x y, max x y]
    qsort (x:y:z:rest) = qsort (filter (< m) (s:rest)) ++ [m] ++ qsort (filter (>= m) (l:rest)) where
        xs = [x, y, z]
        [s, m, l] = [minimum xs, median xs, maximum xs]
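The median function on exactly three elements isn't given above; a direct definition (a sketch of ours, not from the book's source) is:

    -- Median of exactly three elements.
    median :: Ord a => [a] -> a
    median [a, b, c] = max (min a b) (min (max a b) c)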
Unfortunately, none of the above four worst cases is handled well by this program, because the sampling is not good enough: we need a telescope, not a microscope, to profile the whole list to be partitioned. We'll see the functional way to solve the partition problem later.

Besides median-of-three, there is another popular engineering practice to get a good partition result: instead of always taking the first element or the last one as the pivot, an alternative is to randomly select one, as in the following modification.

1: procedure Sort(A, l, u)
2:     if u - l > 1 then
3:         Exchange A[l] ↔ A[Random(l, u)]
4:         (i, j) ← Partition(A, l, u)
5:         Sort(A, l, i)
6:         Sort(A, j, u)

The function Random(l, u) returns a random integer i such that l ≤ i < u. The element at this position is exchanged with the first one, so that it is selected as the pivot for the further partition. This algorithm is called random quick sort [2].

Theoretically, neither median-of-three nor random quick sort can avoid the worst case completely. If the sequence to be sorted is randomly distributed, choosing the first element or any other arbitrary one as the pivot are equally effective. Considering that the underlying data structure of the sequence is a singly linked list in the functional setting, it's expensive to strictly apply the idea of random quick sort in the purely functional approach. Even with this bad news, these engineering improvements still make sense in real-world programming.

13.6 Other engineering practice

There is some other engineering practice which doesn't focus on solving the bad partition issue. Robert Sedgewick observed that when the list to be sorted is short, the overhead introduced by quick sort is relatively expensive; on the other hand, insertion sort performs better in such cases [4], [5]. Sedgewick, Bentley and McIlroy tried different thresholds, known as `cut-off': when there are fewer than cut-off elements, the sort algorithm falls back to insertion sort.

1: procedure Sort(A, l, u)
2:     if u - l > Cut-Off then
3:         Quick-Sort(A, l, u)
4:     else
5:         Insertion-Sort(A, l, u)

The implementation of this improvement is left as an exercise to the reader.

Exercise 13.4

- Can you figure out more quick sort worst cases besides the four given in this section?
- Implement the median-of-three method in your favorite imperative programming language.
- Implement random quick sort in your favorite imperative programming language.
- Implement the algorithm which falls back to insertion sort when the length of the list is small, in both imperative and functional approaches.
13.7 Side words

An implementation is sometimes called a `true quick sort' if it is equipped with most of the engineering practices we introduced, including the insertion sort fall-back with cut-off, in-place exchanging, pivot selection with the median-of-three method, and 3-way partition. The purely functional one, which expresses the idea of quick sort perfectly, can't take all of them. Thus some people think the functional quick sort is essentially tree sort.

Actually, quick sort does have a close relationship with tree sort. Richard Bird shows how to derive quick sort from binary tree sort by deforestation [7]. Consider a binary search tree creation algorithm called unfold, which turns a list of elements into a binary search tree.

unfold(L) =
    ∅ : L = ∅
    tree(T_l, l₁, T_r) : otherwise    (13.25)

Where

T_l = unfold({a | a ∈ L′, a ≤ l₁})
T_r = unfold({a | a ∈ L′, l₁ < a})    (13.26)

The interesting point is that this algorithm creates the tree in a different way from the one introduced in the chapter about binary search trees. If the list to be unfolded is empty, the result is obviously an empty tree; this is the trivial edge case. Otherwise, the algorithm sets the first element l₁ of the list as the key of the node, and recursively creates its left and right children, where the elements used to form the left child are those which are less than or equal to the key in L′, while the rest elements, which are greater than the key, are used to form the right child.

Recall the algorithm which turns a binary search tree into a list by in-order traversal:

toList(T) =
    ∅ : T = ∅
    toList(left(T)) ∪ {key(T)} ∪ toList(right(T)) : otherwise    (13.27)

We can define the quick sort algorithm by composing these two functions.

quickSort = toList ∘ unfold    (13.28)

The binary search tree built in the first step of applying unfold is an intermediate result. This result is consumed by toList and dropped after the second step. It's quite possible to eliminate this intermediate result, which leads to the basic version of quick sort. The elimination of the intermediate binary search tree is called deforestation. This concept is based on Burstall and Darlington's work [9].
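The derivation can be made concrete with a short Haskell sketch under the definitions above (ours; the book's accompanying programs may differ):

    -- A throw-away binary search tree: unfold builds it, toList
    -- flattens it in-order; fusing the two passes (deforestation)
    -- yields the basic quick sort.
    data Tree a = Empty | Node (Tree a) a (Tree a)

    unfold :: Ord a => [a] -> Tree a
    unfold []     = Empty
    unfold (x:xs) = Node (unfold [a | a <- xs, a <= x]) x
                         (unfold [a | a <- xs, x < a])

    toList :: Tree a -> [a]
    toList Empty        = []
    toList (Node l k r) = toList l ++ [k] ++ toList r

    quickSort :: Ord a => [a] -> [a]
    quickSort = toList . unfold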
13.8 Merge sort

Although quick sort performs perfectly in the average case, it can't avoid the worst case no matter what engineering practice is applied. Merge sort, on the other hand, ensures that the performance is bound to O(n lg n) in all cases. It's particularly useful in theoretical algorithm design and analysis. Another feature is that merge sort is friendly to linked-list settings, and is thus suitable for sorting non-consecutively stored sequences. Some programming environments adopt merge sort as the standard library sorting solution, such as Haskell, Python and Java (since Java 7).

In this section, we'll first brief the intuitive idea of merge sort and provide a basic version. After that, some variants of merge sort will be given, including natural merge sort and bottom-up merge sort.

13.8.1 Basic version

Same as quick sort, the essential idea behind merge sort is also divide and conquer. Different from quick sort, merge sort enforces the division to be strictly balanced: it always splits the sequence to be sorted at the middle point. After that, it recursively sorts the sub-sequences and merges the two sorted sequences into the final result. The algorithm can be described as the following. In order to sort a sequence L:

- Trivial edge case: if the sequence to be sorted is empty, the result is obviously empty;
- Otherwise, split the sequence at the middle position, recursively sort the two sub-sequences, and merge the results.

The basic merge sort algorithm can be formalized with the following equation.

sort(L) =
    ∅ : L = ∅
    merge(sort(L₁), sort(L₂)) : otherwise, (L₁, L₂) = splitAt(⌊|L|/2⌋, L)    (13.29)

Merge

There are two `black-boxes' in the above merge sort definition: one is the splitAt function, which splits a list at a given position; the other is the merge function, which merges two sorted lists into one.

As presented in the appendix of this book, it's trivial to realize splitAt in imperative settings by using random access. However, in functional settings, it's typically realized as a linear algorithm:

splitAt(n, L) =
    (∅, L) : n = 0
    ({l₁} ∪ A, B) : otherwise, (A, B) = splitAt(n - 1, L′)    (13.30)

Where l₁ is the first element of L, and L′ represents the rest elements except for l₁ if L isn't empty.
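In Haskell, equation (13.30) transcribes directly. The following is a sketch of ours (the Prelude already provides splitAt; we add an empty-list case to keep the function total, since the equation assumes n does not exceed the length of L):

    splitAt' :: Int -> [a] -> ([a], [a])
    splitAt' 0 xs     = ([], xs)
    splitAt' _ []     = ([], [])
    splitAt' n (x:xs) = let (as, bs) = splitAt' (n - 1) xs in (x:as, bs)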
The idea of merge can be illustrated as in figure 13.9. Consider two lines of kids. The kids have already stood in order of their heights: the shortest one stands first, then a taller one, and the tallest one stands at the end of the line.

[Figure 13.9: Two lines of kids pass a door.]

Now let's ask the kids to pass a door one by one; at any time at most one kid can pass the door. The kids must pass this door in the order of their heights: no kid can pass the door before all the kids shorter than him/her. Since the two lines of kids have already been `sorted', the solution is to ask the first two kids, one from each line, to compare their heights, and let the shorter kid pass the door; they repeat this step until one line is empty. After that, all the remaining kids can pass the door one by one.

This idea can be formalized in the following equation.

merge(A, B) =
    A : B = ∅
    B : A = ∅
    {a₁} ∪ merge(A′, B) : a₁ ≤ b₁
    {b₁} ∪ merge(A, B′) : otherwise    (13.31)

Where a₁ and b₁ are the first elements of lists A and B; A′ and B′ are the rest elements except for the first ones respectively. The first two cases are trivial edge cases: merging one sorted list with an empty list results in the same sorted list. Otherwise, if both lists are non-empty, we take the first elements from the two lists, compare them, and use the minimum as the first element of the result, then recursively merge the rest.

With merge defined, the basic version of merge sort can be implemented like the following Haskell example code.

    msort [] = []
    msort [x] = [x]
    msort xs = merge (msort as) (msort bs) where
        (as, bs) = splitAt (length xs `div` 2) xs

    merge xs [] = xs
    merge [] ys = ys
    merge (x:xs) (y:ys) | x <= y = x : merge xs (y:ys)
                        | otherwise = y : merge (x:xs) ys

Note that the implementation differs from the algorithm definition in that it treats the singleton list as a trivial edge case as well.
Merge sort can also be realized imperatively. The basic version can be developed as the algorithm below.

1: procedure Sort(A)
2:     if |A| > 1 then
3:         m ← ⌊|A|/2⌋
4:         X ← Copy-Array(A[1...m])
5:         Y ← Copy-Array(A[m + 1...|A|])
6:         Sort(X)
7:         Sort(Y)
8:         Merge(A, X, Y)

When the array to be sorted contains at least two elements, the non-trivial sorting process starts: it first copies the first half to a newly created array X and the second half to a second new array Y, recursively sorts them, and finally merges the sorted results back into A.

This version uses the same amount of extra space as A. This is because the Merge algorithm isn't in-place at the moment. We'll introduce the imperative in-place merge sort in a later section.

The merge process does almost the same thing as the functional definition. There is a verbose version, and a simplified version which uses a sentinel. The verbose merge algorithm continuously checks elements from the two input arrays, picks the smaller one and puts it into the result array A, then advances along the arrays respectively until either input array is exhausted. After that, the algorithm appends the rest of the elements of the other input array to A.

1: procedure Merge(A, X, Y)
2:     i ← 1, j ← 1, k ← 1
3:     m ← |X|, n ← |Y|
4:     while i ≤ m ∧ j ≤ n do
5:         if X[i] ≤ Y[j] then
6:             A[k] ← X[i]
7:             i ← i + 1
8:         else
9:             A[k] ← Y[j]
10:            j ← j + 1
11:        k ← k + 1
12:    while i ≤ m do
13:        A[k] ← X[i]
14:        k ← k + 1
15:        i ← i + 1
16:    while j ≤ n do
17:        A[k] ← Y[j]
18:        k ← k + 1
19:        j ← j + 1

Although this algorithm is a bit verbose, it can be short in programming environments with enough tools to manipulate arrays. The following Python program is an example.

    def msort(xs):
        n = len(xs)
        if n > 1:
            ys = [x for x in xs[:n//2]]
            zs = [x for x in xs[n//2:]]
            ys = msort(ys)
            zs = msort(zs)
            xs = merge(xs, ys, zs)
        return xs

    def merge(xs, ys, zs):
        i = 0
        while ys != [] and zs != []:
            xs[i] = ys.pop(0) if ys[0] <= zs[0] else zs.pop(0)
            i = i + 1
        xs[i:] = ys if ys != [] else zs
        return xs

Performance

Before diving into improvements of this basic version, let's analyze the performance of merge sort. The algorithm consists of two steps: the divide step and the merge step. In the divide step, the sequence to be sorted is always divided into two sub-sequences of the same length. If we draw a partition tree similar to what we did for quick sort, we find that this tree is a perfectly balanced binary tree, as shown in figure 13.3.
Thus the height of this tree is O(lg n), which means the recursion depth of merge sort is bound to O(lg n). Merging happens at every level. It's intuitive to analyze the merge algorithm: it compares elements from the two input sequences in pairs, and after one sequence is fully examined, the rest of the other is copied one by one to the result; thus it's a linear algorithm, proportional to the length of the sequence. Based on these facts, denoting by T(n) the time for sorting a sequence of length n, we can write the recursive time cost as below.

T(n) = T(n/2) + T(n/2) + cn = 2T(n/2) + cn    (13.32)

It states that the cost consists of three parts: merge sorting the first half takes T(n/2), merge sorting the second half also takes T(n/2), and merging the two results takes cn, where c is some constant. Solving this equation gives the result O(n lg n). Note that this performance doesn't vary across cases, as merge sort always divides the input uniformly.
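As a quick check (a standard expansion, assuming for simplicity that n is a power of 2), unfolding the recurrence level by level shows where this bound comes from:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     = ⋯
     = nT(1) + cn lg n
     = O(n lg n)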
Another significant performance indicator is space occupation; however, it varies a lot among different merge sort implementations. The detailed space bounds will be analyzed for each variant later. For the basic imperative merge sort, observe that every recursion demands the same amount of space as its input, in order to copy the elements for recursive sorting; these spaces are released when that level of recursion finishes. Summing the temporary arrays allocated over all O(lg n) recursion levels gives O(n lg n).

The functional merge sort consumes much less than this amount, because the underlying data structure of the sequence is a linked-list; thus it needn't extra space for merging³. The only space requirement is for book-keeping the stack of recursive calls. This can be seen in the later explanation of the even-odd split algorithm.

³The complex effects caused by lazy evaluation are ignored here; please refer to [7] for details.

Minor improvement

We'll next improve the basic merge sort bit by bit, for both the functional and imperative realizations.
The first observation is that the imperative merge algorithm is a bit verbose. [2] presents an elegant simplification by using positive ∞ as a sentinel: we append ∞ as the last element to both ordered arrays before merging⁴, so we needn't test which array is exhausted. Figure 13.10 illustrates this idea.

Figure 13.10: Merge with ∞ as sentinels.

⁴For sorting in monotonic non-increasing order, −∞ can be used instead.

1: procedure Merge(A, X, Y)
2:   Append(X, ∞)
3:   Append(Y, ∞)
4:   i ← 1, j ← 1
5:   for k from 1 to |A| do
6:     if X[i] ≤ Y[j] then
7:       A[k] ← X[i]
8:       i ← i + 1
9:     else
10:      A[k] ← Y[j]
11:      j ← j + 1

The following ANSI C program implements this idea, embedding the merge inside the sort.
INF is defined as a big constant number of the same type as Key, where the type can either be defined elsewhere, or we can abstract the type information by passing a comparator as a parameter. We skip these implementation and language details here.

void msort(Key* xs, int l, int u) {
    int i, j, m;
    Key *as, *bs;
    if (u - l > 1) {
        m = l + (u - l) / 2; /* avoid int overflow */
        msort(xs, l, m);
        msort(xs, m, u);
        as = (Key*) malloc(sizeof(Key) * (m - l + 1));
        bs = (Key*) malloc(sizeof(Key) * (u - m + 1));
        memcpy((void*)as, (void*)(xs + l), sizeof(Key) * (m - l));
        memcpy((void*)bs, (void*)(xs + m), sizeof(Key) * (u - m));
        as[m - l] = bs[u - m] = INF;
        for (i = j = 0; l < u; ++l)
            xs[l] = as[i] <= bs[j] ? as[i++] : bs[j++];
        free(as);
        free(bs);
    }
}

Running this program takes much more time than quick sort. Besides the major reason, which we'll explain later, one problem is that this version frequently allocates and releases memory for merging; memory allocation is one of the well-known bottlenecks in the real world, as mentioned by Bentley in [4]. One solution to address this issue is to allocate one additional array, of the same size as the original, as the working area.
The recursive sorts of the first and second halves then needn't allocate any more extra space, but use the working area when merging. Finally, the algorithm copies the merged result back. This idea can be expressed as the following modified algorithm.

1: procedure Sort(A)
2:   B ← Create-Array(|A|)
3:   Sort'(A, B, 1, |A|)

4: procedure Sort'(A, B, l, u)
5:   if u − l > 0 then
6:     m ← ⌊(l + u)/2⌋
7:     Sort'(A, B, l, m)
8:     Sort'(A, B, m + 1, u)
9:     Merge'(A, B, l, m, u)

This algorithm duplicates another array and passes it, along with the original array to be sorted, to the Sort' algorithm. In a real implementation, this working area should be released, either manually or by some automatic tool such as GC (garbage collection). The modified algorithm Merge' also accepts a working area as parameter.

1: procedure Merge'(A, B, l, m, u)
2:   i ← l, j ← m + 1, k ← l
3:   while i ≤ m ∧ j ≤ u do
4:     if A[i] ≤ A[j] then
5:       B[k] ← A[i]
6:       i ← i + 1
7:     else
8:       B[k] ← A[j]
9:       j ← j + 1
10:    k ← k + 1
11:  while i ≤ m do
12:    B[k] ← A[i]
13:    k ← k + 1
14:    i ← i + 1
15:  while j ≤ u do
16:    B[k] ← A[j]
17:    k ← k + 1
18:    j ← j + 1
19:  for i from l to u do    ▷ Copy back
20:    A[i] ← B[i]
By using this minor improvement, the space requirement is reduced to O(n) from O(n lg n). The following ANSI C program implements it. For illustration purposes, we manually copy the merged result back to the original array in a loop; this can also be realized with a standard library tool, such as memcpy.

void merge(Key* xs, Key* ys, int l, int m, int u) {
    int i, j, k;
    i = k = l; j = m;
    while (i < m && j < u)
        ys[k++] = xs[i] <= xs[j] ? xs[i++] : xs[j++];
    while (i < m)
        ys[k++] = xs[i++];
    while (j < u)
        ys[k++] = xs[j++];
    for (; l < u; ++l)
        xs[l] = ys[l];
}

void msort(Key* xs, Key* ys, int l, int u) {
    int m;
    if (u - l > 1) {
        m = l + (u - l) / 2;
        msort(xs, ys, l, m);
        msort(xs, ys, m, u);
        merge(xs, ys, l, m, u);
    }
}

void sort(Key* xs, int l, int u) {
    Key* ys = (Key*) malloc(sizeof(Key) * (u - l));
    msort(xs, ys, l, u);
    free(ys);
}
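The C fragments in this chapter rely on a Key type and a swap helper that the book defines elsewhere. A minimal sketch of how one might scaffold them in order to try the programs above (the typedef, the INF value, and the driver are my assumptions for illustration, not the book's definitions) is:

#include <stdio.h>
#include <stdlib.h>

typedef int Key;                /* assumed: the book keeps Key abstract */
#define INF 0x7fffffff          /* assumed sentinel for the INF version */

/* assumed helper: exchange two elements of an array by index */
void swap(Key* xs, int i, int j) {
    Key t = xs[i]; xs[i] = xs[j]; xs[j] = t;
}

/* with the merge / msort / sort routines above in scope: */
int main(void) {
    Key a[] = {5, 3, 8, 1, 2, 9, 4};
    int i, n = sizeof(a) / sizeof(a[0]);
    sort(a, 0, n);              /* sorts the half-open range [0, n) */
    for (i = 0; i < n; ++i)
        printf("%d ", a[i]);    /* prints: 1 2 3 4 5 8 9 */
    printf("\n");
    return 0;
}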
This new version runs faster than the previous one. On my test machine, it speeds up by about 20% to 25% when sorting 100,000 randomly generated numbers.

The basic functional merge sort can also be fine-tuned. Observe that it splits the list at the middle point; however, as the underlying data structure representing the list is a singly linked-list, random access at a given position is a linear operation (refer to appendix A for details). Alternatively, one can split the list in an even-odd manner: all the elements in even positions are collected in one sub list, while all the elements in odd positions are collected in another. For any list, there are either the same number of elements in even and odd positions, or they differ by one. So this divide strategy always splits well, and the performance can be ensured to be O(n lg n) in all cases. The even-odd splitting algorithm can be defined as below.
split(L) = \begin{cases}
(\emptyset, \emptyset) & : L = \emptyset \\
(\{l_1\}, \emptyset) & : |L| = 1 \\
(\{l_1\} \cup A, \{l_2\} \cup B) & : \text{otherwise}, (A, B) = split(L'')
\end{cases}    (13.33)

When the list is empty, the split result is two empty lists. If there is only one element in the list, we put this single element, which is at position 1, into the odd sub list, and the even sub list is empty. Otherwise, there are at least two elements in the list: we pick the first one for the odd sub list, the second one for the even sub list, and recursively split the rest of the elements. All the other functions are kept the same; the modified Haskell program is given as follows.

split [] = ([], [])
split [x] = ([x], [])
split (x:y:xs) = (x:xs', y:ys') where (xs', ys') = split xs
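As a small worked example, splitting a five-element list by this definition gives:

split [1,2,3,4,5]
= (1:xs', 2:ys') where (xs', ys') = split [3,4,5]
= (1:3:xs'', 2:4:ys'') where (xs'', ys'') = split [5] = ([5], [])
= ([1,3,5], [2,4])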
13.9 In-place merge sort

One drawback of the imperative merge sort is that it requires extra space for merging: the basic version without any optimization allocates O(n lg n) in total, and the one using a working area needs O(n). It's natural to seek an in-place version of merge sort, which can reuse the original array without allocating any extra space. In this section, we'll introduce some solutions for realizing imperative in-place merge sort.

13.9.1 Naive in-place merge

The first idea is straightforward. As illustrated in figure 13.11, sub lists A and B are sorted. When performing in-place merge, the invariant ensures that all elements before i are merged, so that they are in non-decreasing order. Each time we compare the i-th and the j-th elements: if the i-th is less than the j-th, the marker i just advances one step. This is the easy case. Otherwise, the j-th element is the next merge result, and it should be put in front of i. To achieve this, all elements between i and j, including the i-th, are shifted toward the end by one cell. We repeat this process until all the elements in A and B are put into the correct positions.

Figure 13.11: Naive in-place merge (shift if ¬(xs[i] ≤ xs[j])).

1: procedure Merge(A, l, m, u)
2:   while l < m ∧ m ≤ u do
3:     if A[l] ≤ A[m] then
4:       l ← l + 1
5:     else
6:       x ← A[m]
7:       for i ← m down-to l + 1 do    ▷ Shift
8:         A[i] ← A[i − 1]
9:       A[l] ← x
10:      l ← l + 1, m ← m + 1

However, this naive solution downgrades the overall performance of merge sort to quadratic O(n²)! This is because array shifting is a linear operation, proportional to the number of elements in the first sorted sub array which haven't been compared so far.
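To see the quadratic bound (a standard worst-case argument, consistent with the timing observation below): if every element of the second half is smaller than every element of the first half, each of the n/2 elements taken from the second half shifts about n/2 cells, so a single merge alone costs about n²/4 moves, and

T(n) = 2T(n/2) + O(n²) = O(n²)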
The following ANSI C program, based on this algorithm, runs very slowly: it is about 12 times slower than the previous version when sorting 10,000 random numbers.

void naive_merge(Key* xs, int l, int m, int u) {
    int i; Key y;
    for (; l < m && m < u; ++l)
        if (!(xs[l] <= xs[m])) {
            y = xs[m++];
            for (i = m - 1; i > l; --i) /* shift */
                xs[i] = xs[i-1];
            xs[l] = y;
        }
}

void msort3(Key* xs, int l, int u) {
    int m;
    if (u - l > 1) {
        m = l + (u - l) / 2;
        msort3(xs, l, m);
        msort3(xs, m, u);
        naive_merge(xs, l, m, u);
    }
}

13.9.2 In-place working area

In order to implement the in-place merge sort in O(n lg n) time, when sorting a sub array, the rest of the array must be reused as the working area for merging.
As the elements stored in the working area will be sorted later, they can't be overwritten. We can modify the previous algorithm, which allocated extra space for merging, a bit to achieve this. The idea is that, every time we compare the front elements of the two sorted sub arrays and want to put the smaller one to the target position in the working area, we instead exchange it with what is stored in the working area. Thus, after merging, the two sub arrays store what the working area previously contained. This idea is illustrated in figure 13.12.

Figure 13.12: Merge without overwriting the working area (swap(A[i], C[k]) if A[i] ≤ B[j]).

In our algorithm, both the two sorted sub arrays and the working area for merging are parts of the original array to be sorted. We need to supply the following arguments when merging: the start and end points of the sorted sub arrays, which can be represented as ranges, and the start point of the working area. The following algorithm, for example, uses [a, b) to indicate the range including a and excluding b. It merges the sorted range [i, m) and the range [j, n) into the working area starting from k.

1: procedure Merge(A, [i, m), [j, n), k)
2:   while i < m ∧ j < n do
3:     if A[i] ≤ A[j] then
4:       Exchange A[k] ↔ A[i]
5:       i ← i + 1
6:     else
7:       Exchange A[k] ↔ A[j]
8:       j ← j + 1
9:     k ← k + 1
10:  while i < m do
11:    Exchange A[k] ↔ A[i]
12:    i ← i + 1
13:    k ← k + 1
14:  while j < n do
15:    Exchange A[k] ↔ A[j]
16:    j ← j + 1
17:    k ← k + 1

Note that the following two constraints must be satisfied when merging:

1. The working area should be within the bounds of the array. In other words, it should be big enough to hold the elements exchanged in, without causing any out-of-bound error;
2. The working area can overlap with either of the two sorted arrays; however, we must ensure that no unmerged elements are overwritten.

This algorithm can be implemented in ANSI C as the following example.

void wmerge(Key* xs, int i, int m, int j, int n, int w) {
    while (i < m && j < n)
        swap(xs, w++, xs[i] <= xs[j] ? i++ : j++);
    while (i < m)
        swap(xs, w++, i++);
    while (j < n)
        swap(xs, w++, j++);
}
With this merging algorithm defined, it's easy to imagine a solution which can sort half of the array. The next question is: how do we deal with the rest, the unsorted part stored in the working area, as shown in figure 13.13?

Figure 13.13: Half of the array is sorted.

One intuitive idea is to recursively sort the other half of the working area; then only 1/4 of the elements remain unsorted, as shown in figure 13.14. The key point at this stage is that, sooner or later, we must merge the sorted 1/4 of elements, B, with the sorted 1/2, A.

Figure 13.14: A and B must be merged at some time.

Is the remaining working area, which holds only 1/4 of the elements, big enough for merging A and B? Unfortunately, it isn't, in the setting shown in figure 13.14. However, the second constraint mentioned before gives us a hint: we can exploit it by arranging the working area to overlap with either sub array, as long as we can ensure the unmerged elements won't be overwritten under some well-designed merging scheme. Actually, instead of sorting the second half of the working area, we can sort the first half, and put the working area between the two sorted arrays, as shown in figure 13.15 (a). This setup effectively arranges the working area to overlap with the sub array A. This idea is proposed in [10].

Figure 13.15: Merge A and B with the working area. In panel (a) the working area lies between sorted B (1/4) and sorted A (1/2); in panel (b), after merging, the working area occupies the left-most 1/4 and the right 3/4 is merged.

Let's consider two extreme cases:

1. All the elements in B are less than any element in A. In this case, the merge algorithm finally moves the whole contents of B to the working area, while the cells of B hold what was previously stored in the working area; as the size of the area is the same as that of B, it's OK to exchange their contents.

2. All the elements in A are less than any element in B. In this case, the merge algorithm continuously exchanges elements between A and the working area. After all the previous 1/4 cells of the working area are filled with elements from A, the algorithm starts to overwrite the first half of A. Fortunately, the contents being overwritten are not unmerged elements: the working area in effect advances toward the end of the array, and finally moves to the right side. From this point, the merge algorithm starts exchanging the contents of B with the working area. The result is that the working area moves to the left-most side, as shown in figure 13.15 (b).

We can repeat this step: always sort the second half of the unsorted part, and exchange the sorted sub array into the first half as the working area.
Thus we keep reducing the working area from 1/2 of the array, to 1/4 of the array, to 1/8 of the array, ... The scale of the merge problem keeps reducing. When there is only one element left in the working area, we needn't sort it any more, since a singleton array is sorted by nature; merging a singleton array into the other is equivalent to inserting the element. In practice, the algorithm can finalize the last few elements by switching to insertion sort.

The whole algorithm can be described as follows.

1: procedure Sort(A, l, u)
2:   if u − l > 0 then
3:     m ← ⌊(l + u)/2⌋
4:     w ← l + u − m
5:     Sort'(A, l, m, w)    ▷ The second half contains the sorted elements
6:     while w − l > 1 do
7:       u′ ← w
8:       w ← ⌈(l + u′)/2⌉    ▷ Ensure the working area is big enough
9:       Sort'(A, w, u′, l)    ▷ The first half holds the sorted elements
10:      Merge(A, [l, l + u′ − w], [u′, u], w)
11:    for i ← w down-to l do    ▷ Switch to insertion sort
12:      j ← i
13:      while j ≤ u ∧ A[j] < A[j − 1] do
14:        Exchange A[j] ↔ A[j − 1]
15:        j ← j + 1
Note that, in order to satisfy the first constraint, we must ensure the working area is big enough to hold all the elements exchanged in; that's why we round by taking the ceiling when sorting the second half of the working area. Note also that we actually pass the ranges including the end points to the algorithm Merge.

Next, we develop a Sort' algorithm, which mutually recursively calls Sort and exchanges the result into the working area.

1: procedure Sort'(A, l, u, w)
2:   if u − l > 0 then
3:     m ← ⌊(l + u)/2⌋
4:     Sort(A, l, m)
5:     Sort(A, m + 1, u)
6:     Merge(A, [l, m], [m + 1, u], w)
7:   else    ▷ Exchange all elements to the working area
8:     while l ≤ u do
9:       Exchange A[l] ↔ A[w]
10:      l ← l + 1
11:      w ← w + 1

Different from the naive in-place sort, this algorithm doesn't shift the array during merging. The main algorithm reduces the unsorted part in the sequence n/2, n/4, n/8, ..., so it takes O(lg n) steps to complete sorting. In every step, it recursively sorts half of the remaining elements and performs linear-time merging. Denote the time cost of sorting n elements as T(n); we have the following equation.

T(n) = T(n/2) + cn/2 + T(n/4) + 3cn/4 + T(n/8) + 7cn/8 + ⋯    (13.34)

Solving this equation with the telescope method gives the result O(n lg n). The detailed process is left as an exercise to the reader.

The following ANSI C code completes the implementation by using the example wmerge program given above.

void imsort(Key* xs, int l, int u);

void wsort(Key* xs, int l, int u, int w) {
    int m;
    if (u - l > 1) {
        m = l + (u - l) / 2;
        imsort(xs, l, m);
        imsort(xs, m, u);
        wmerge(xs, l, m, m, u, w);
    }
    else
        while (l < u) swap(xs, l++, w++);
}

void imsort(Key* xs, int l, int u) {
    int m, n, w;
    if (u - l > 1) {
        m = l + (u - l) / 2;
        w = l + u - m;
        wsort(xs, l, m, w); /* the last half contains sorted elements */
        while (w - l > 2) {
            n = w;
            w = l + (n - l + 1) / 2; /* ceiling */
            wsort(xs, w, n, l); /* the first half contains sorted elements */
            wmerge(xs, l, l + n - w, n, u, w);
        }
        for (n = w; n > l; --n) /* switch to insertion sort */
            for (m = n; m < u && xs[m] < xs[m-1]; ++m)
                swap(xs, m, m - 1);
    }
}

However, this program doesn't run faster than the version we developed in the previous section, which doubles the array in advance as the working area. On my machine, it is about 60% slower when sorting 100,000 random numbers, due to the many swap operations.

13.9.3 In-place merge sort vs. linked-list merge sort

The in-place merge sort is still a live area of research. In order to save the extra space for merging, some overhead has been introduced, which increases the complexity of the merge sort algorithm. However, if the underlying data structure isn't an array but a linked-list, merge can be achieved without any extra space, as shown in the even-odd functional merge sort algorithm presented in the previous section. To make this clearer, we can develop a purely imperative linked-list merge sort solution. The linked-list can be defined as a record type, as shown in appendix A, like below.
struct Node {
    Key key;
    struct Node* next;
};

We can define an auxiliary function for linking nodes. Assuming the list to be linked isn't empty, it can be implemented as follows.

struct Node* link(struct Node* xs, struct Node* ys) {
    xs->next = ys;
    return xs;
}

One method to realize the imperative even-odd splitting is to initialize two empty sub lists, then iterate over the list to be split. Each time, we link the current node in front of the first sub list, then exchange the two sub lists, so that the second sub list will be linked to at the next iteration. This idea can be illustrated as below.

1: function Split(L)
2:   (A, B) ← (∅, ∅)
3:   while L ≠ ∅ do
4:     p ← L
5:     L ← Next(L)
6:     A ← Link(p, A)
7:     Exchange A ↔ B
8:   return (A, B)
The following example ANSI C program implements this splitting algorithm, with the split embedded.

struct Node* msort(struct Node* xs) {
    struct Node *p, *as, *bs;
    if (!xs || !xs->next) return xs;
    as = bs = NULL;
    while (xs) {
        p = xs;
        xs = xs->next;
        as = link(p, as);
        swap(as, bs);
    }
    as = msort(as);
    bs = msort(bs);
    return merge(as, bs);
}

The only thing left is to develop the imperative merging algorithm for linked-lists. The idea is quite similar to the array merging version. As long as neither of the sub lists is exhausted, we pick the smaller head and append it to the result list. After that, we just need to link the non-empty remainder to the tail of the result, rather than loop to copy it. Some care is needed to initialize the result list, as its head node is the smaller of the two sub lists' heads. One simple method is to use a dummy sentinel head, and drop it before returning. This implementation detail is given below.

struct Node* merge(struct Node* as, struct Node* bs) {
    struct Node s, *p;
    p = &s;
    while (as && bs) {
        if (as->key < bs->key) {
            link(p, as);
            as = as->next;
        } else {
            link(p, bs);
            bs = bs->next;
        }
        p = p->next;
    }
    if (as) link(p, as);
    if (bs) link(p, bs);
    return s.next;
}
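A hypothetical driver for the linked-list version follows. The cons helper and the pointer-swapping swap macro here are my assumptions for illustration: the msort above requires swap to exchange the two list-pointer variables themselves, which the book's definitions presumably provide.

#include <stdio.h>
#include <stdlib.h>

/* assumed: exchanges two struct Node* variables, as msort above requires */
#define swap(a, b) do { struct Node* t = (a); (a) = (b); (b) = t; } while (0)

/* assumed helper: allocate a new node in front of a list */
struct Node* cons(Key k, struct Node* next) {
    struct Node* p = (struct Node*) malloc(sizeof(struct Node));
    p->key = k;
    p->next = next;
    return p;
}

int main(void) {
    struct Node* xs = cons(3, cons(1, cons(2, NULL)));
    for (xs = msort(xs); xs; xs = xs->next)
        printf("%d ", xs->key);   /* prints: 1 2 3 */
    printf("\n");
    return 0;
}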
Exercise 13.5

Prove that the performance of in-place merge sort is bound to O(n lg n).

13.10 Nature merge sort

Knuth gives another way to interpret the idea of divide and conquer merge sort: it is just like burning a candle from both ends [1]. This leads to the nature merge sort algorithm.

Figure 13.16: Burn a candle from both ends

For any given sequence, we can always find a non-decreasing sub sequence starting at any position. One particular case is that we can find such a sub sequence from the left-most position. The following table lists some examples; the leading non-decreasing sub sequences (in bold font in the original) are marked with brackets.

[15], 0, 4, 3, 5, 2, 7, 1, 12, 14, 13, 8, 9, 6, 10, 11
[8, 12, 14], 0, 1, 4, 11, 2, 3, 5, 9, 13, 10, 6, 15, 7
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

The first row illustrates the worst case: the second element is less than the first one, so the non-decreasing sub sequence is a singleton list containing only the first element. The last row shows the best case: the sequence is ordered, and the non-decreasing sub sequence is the whole. The second row shows the average case.

Symmetrically, we can always find a non-decreasing sub sequence from the end of the sequence toward the left. This suggests that we can merge the two non-decreasing sub sequences, one from the beginning and the other from the end, into a longer sorted sequence. The advantage of this idea is that we utilize the naturally ordered sub sequences, so we needn't recursive sorting at all.

Figure 13.17: Nature merge sort. For example, the leading run 8, 12, 14 and the trailing run 15, 7 of the second row are merged to 7, 8, 12, 14, 15 at the front of the working area.

Figure 13.17 illustrates this idea. We start the algorithm by scanning from both ends, finding the longest non-decreasing sub sequences respectively.
After that, these two sub sequences are merged into the working area, and the merged result starts from the beginning. Next, we repeat this step, scanning onward toward the center of the original sequence. This time we merge the two ordered sub sequences to the right-hand end of the working area, growing toward the left. Such a setup is convenient for the next round of scanning. When all the elements in the original sequence have been scanned and merged to the target, we switch to using the elements stored in the working area for sorting, and use the previous sequence as the new working area. Such switching happens repeatedly in each round. Finally, we copy all elements from the working area to the original array if necessary.

The only question left is when this algorithm stops. The answer is: when we start a new round of scanning and find that the longest non-decreasing sub list spans to the end, the whole list is ordered, and the sorting is done.

Because this kind of merge sort processes the target sequence from both directions and uses the natural ordering of sub sequences, it's named nature two-way merge sort. In order to realize it, some care must be taken. Figure 13.18 shows the invariant during the nature merge sort. At any time, all elements before marker a and after marker d have already been scanned and merged. We try to span the non-decreasing sub sequence [a, b) as far as possible; at the same time, we span the sub sequence [c, d) from right to left as far as possible as well. The invariant for the working area is shown in the second row of the figure: all elements before f and after r have already been sorted (note that they may contain several ordered sub sequences). In the odd rounds (1, 3, 5, ...), we merge [a, b) and [c, d) from f toward the right; in the even rounds (2, 4, 6, ...), we merge the two sub sequences from r toward the left.

Figure 13.18: Invariant during nature merge sort

For the imperative realization, the sequence is represented by an array. Before sorting starts, we duplicate the array to create a working area. The pointers a and b are initialized to the left-most position, while c and d point to the right-most position. Pointer f starts by pointing to the front of the working area, and r points to the rear position.
1: function Sort(A)
2:   if |A| > 1 then
3:     n ← |A|
4:     B ← Create-Array(n)    ▷ Create the working area
5:     loop
6:       [a, b) ← [1, 1)
7:       [c, d) ← [n + 1, n + 1)
8:       f ← 1, r ← n    ▷ front and rear pointers to the working area
9:       t ← True    ▷ merge to front or rear
10:      while b < c do    ▷ There are still elements to scan
11:        repeat    ▷ Span [a, b)
12:          b ← b + 1
13:        until b ≥ c ∨ A[b] < A[b − 1]
14:        repeat    ▷ Span [c, d)
15:          c ← c − 1
16:        until c ≤ b ∨ A[c − 1] < A[c]
17:        if c < b then    ▷ Avoid overlap
18:          c ← b
19:        if b − a ≥ n then    ▷ Done if [a, b) spans the whole array
20:          return A
21:        if t then    ▷ merge to front
22:          f ← Merge(A, [a, b), [c, d), B, f, 1)
23:        else    ▷ merge to rear
24:          r ← Merge(A, [a, b), [c, d), B, r, −1)
25:        a ← b, d ← c
26:        t ← ¬t    ▷ Switch the merge direction
27:      Exchange A ↔ B    ▷ Switch working area
28:  return A

The merge algorithm is almost the same as before, except that we need to pass a parameter δ to indicate the direction of merging.

1: function Merge(A, [a, b), [c, d), B, w, δ)
2:   while a < b ∧ c < d do
3:     if A[a] ≤ A[d − 1] then
4:       B[w] ← A[a]
5:       a ← a + 1
6:     else
7:       B[w] ← A[d − 1]
8:       d ← d − 1
9:     w ← w + δ
10:  while a < b do
11:    B[w] ← A[a]
12:    a ← a + 1
13:    w ← w + δ
14:  while c < d do
15:    B[w] ← A[d − 1]
16:    d ← d − 1
17:    w ← w + δ
18:  return w
The following ANSI C program implements this two-way nature merge sort algorithm. Note that it doesn't release the allocated working area explicitly.

int merge(Key* xs, int a, int b, int c, int d, Key* ys, int k, int delta) {
    for (; a < b && c < d; k += delta)
        ys[k] = xs[a] <= xs[d-1] ? xs[a++] : xs[--d];
    for (; a < b; k += delta)
        ys[k] = xs[a++];
    for (; c < d; k += delta)
        ys[k] = xs[--d];
    return k;
}

Key* sort(Key* xs, Key* ys, int n) {
    int a, b, c, d, f, r, t;
    if (n < 2) return xs;
    for (;;) {
        a = b = 0;
        c = d = n;
        f = 0; r = n - 1;
        t = 1;
        while (b < c) {
            do {  /* span [a, b) as much as possible */
                ++b;
            } while (b < c && xs[b-1] <= xs[b]);
            do {  /* span [c, d) as much as possible */
                --c;
            } while (b < c && xs[c] <= xs[c-1]);
            if (c < b)
                c = b;  /* eliminate overlap if any */
            if (b - a >= n)
                return xs;  /* sorted */
            if (t)
                f = merge(xs, a, b, c, d, ys, f, 1);
            else
                r = merge(xs, a, b, c, d, ys, r, -1);
            a = b;
            d = c;
            t = !t;
        }
        swap(xs, ys);
    }
    return xs; /* can't be here */
}

The performance of nature merge sort depends on the actual ordering of the sub arrays. However, it in fact performs well even in the worst case.
Suppose that we are unlucky when scanning the array, and the length of the non-decreasing sub arrays is always 1 during the first round of scanning. This leads to a working area of merged ordered sub arrays of length 2. Suppose that we are unlucky again in the second round of scanning; however, the previous results ensure that the non-decreasing sub arrays in this round are no shorter than 2, so this time the working area will be filled with merged ordered sub arrays of length 4, ... Repeating this, the length of the non-decreasing sub arrays doubles in every round, so there are at most O(lg n) rounds, and in every round we scan all the elements. The overall performance for this worst case is thus bound to O(n lg n). We'll come back to this interesting phenomenon in the next section, about bottom-up merge sort.

In purely functional settings, however, it's not sensible to scan the list from both ends, since the underlying data structure is a singly linked-list. The nature merge sort can be realized with another approach. Observe that the list to be sorted consists of several non-decreasing sub lists; we can pick every two of such sub lists and merge them into a bigger one. We repeatedly pick and merge, so that the number of non-decreasing sub lists halves continuously, and finally there is only one such list, which is the sorted result. This idea can be formalized in the following equation.
sort(L) = sort'(group(L))    (13.35)

Where the function group(L) groups the list into non-decreasing sub lists. This function can be described as below; the first two cases are trivial edge cases. If the list is empty, the result is a list containing an empty list; if there is only one element in the list, the result is a list containing a singleton list. Otherwise, the first two elements are compared: if the first one is less than or equal to the second, it is linked in front of the first sub list of the recursive grouping result; otherwise a singleton list containing the first element is placed as the first sub list before the recursive result.

group(L) = \begin{cases}
\{L\} & : |L| \leq 1 \\
\{\{l_1\} \cup L_1, L_2, ...\} & : l_1 \leq l_2, \{L_1, L_2, ...\} = group(L') \\
\{\{l_1\}, L_1, L_2, ...\} & : \text{otherwise}
\end{cases}    (13.36)

It's quite possible to abstract the grouping criterion as a parameter, to develop a generic grouping function, for instance, as in the following Haskell code⁵.

groupBy' :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy' _ [] = [[]]
groupBy' _ [x] = [[x]]
groupBy' f (x:xs@(x':_)) | f x x' = (x:ys):yss
                         | otherwise = [x]:r
  where r@(ys:yss) = groupBy' f xs

⁵There is a groupBy function provided in the Haskell standard library Data.List. However, it doesn't fit here, because it accepts an equality testing function as parameter, which must satisfy the properties of being reflexive, transitive, and symmetric; but what we use here, the less-than-or-equal-to operation, doesn't conform to symmetry. Refer to appendix A of this book for details.
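To see how grouping works, take the first example row from the table in the previous section; group splits it into exactly the bracketed runs shown there:

group({15, 0, 4, 3, 5, 2, 7, 1, 12, 14, 13, 8, 9, 6, 10, 11})
= {{15}, {0, 4}, {3, 5}, {2, 7}, {1, 12, 14}, {13}, {8, 9}, {6, 10, 11}}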
Different from the sort function, which sorts a list of elements, the function sort' accepts a list of sub lists, which is the result of grouping.

sort'(L) = \begin{cases}
\emptyset & : L = \emptyset \\
L_1 & : L = \{L_1\} \\
sort'(mergePairs(L)) & : \text{otherwise}
\end{cases}    (13.37)

The first two are the trivial edge cases. If the list to be sorted is empty, the result is obviously empty; if it contains only one sub list, then we are done, and we need just extract this single sub list as the result. For the recursive case, we call a function mergePairs to merge every two sub lists, then recursively call sort'.

The next undefined function is mergePairs. As the name indicates, it repeatedly merges pairs of non-decreasing sub lists into bigger ones.

mergePairs(L) = \begin{cases}
L & : |L| \leq 1 \\
\{merge(L_1, L_2)\} \cup mergePairs(L'') & : \text{otherwise}
\end{cases}    (13.38)

When there are fewer than two sub lists in the list, we are done; otherwise, we merge the first two sub lists L₁ and L₂, and recursively merge the rest of the pairs in L''. The type of the result of mergePairs is a list of lists; however, it will be flattened by the sort' function finally. The merge function is the same as before. The complete example Haskell program is given below.

mergesort = sort' . groupBy' (<=)

sort' [] = []
sort' [xs] = xs
sort' xss = sort' (mergePairs xss) where
  mergePairs (xs:ys:xss) = merge xs ys : mergePairs xss
  mergePairs xss = xss

Alternatively, observe that we can first pick two sub lists and merge them into an intermediate result, then repeatedly pick the next sub list and merge it into the ordered result we've got so far, until all the remaining sub lists are merged. This is a typical folding algorithm, as introduced in appendix A.

sort(L) = fold(merge, ∅, group(L))    (13.39)

Translating this version to Haskell yields the folding version.

mergesort' = foldl merge [] . groupBy' (<=)

Exercise 13.6

Is the nature merge sort algorithm realized by folding equivalent to the one using mergePairs in terms of performance? If yes, prove it; if not, which one is faster?

13.11 Bottom-up merge sort

The worst case analysis for nature merge sort raises an interesting topic: instead of realizing merge sort in a top-down manner, we can develop a bottom-up version.
The great advantage is that we needn't do bookkeeping any more, so the algorithm is quite friendly for purely iterative implementation.

The idea of bottom-up merge sort is to turn the sequence to be sorted into n small sub sequences, each containing only one element. Then we merge every two of such small sub sequences, so that we get ⌊n/2⌋ ordered sub sequences, each of length 2; if n is an odd number, we leave the last singleton sequence untouched. We repeatedly merge these pairs, and finally we get the sorted result. Knuth names this variant `straight two-way merge sort' [1]. The bottom-up merge sort is illustrated in figure 13.19.

Figure 13.19: Bottom-up merge sort

Different from the basic version and the even-odd version, we needn't explicitly split the list to be sorted in every recursion. The whole list is split into n singletons at the very beginning, and we merge these sub lists in the rest of the algorithm.

sort(L) = sort'(wraps(L))    (13.40)

wraps(L) = \begin{cases}
\emptyset & : L = \emptyset \\
\{\{l_1\}\} \cup wraps(L') & : \text{otherwise}
\end{cases}    (13.41)

Of course wraps can be implemented by using mapping, as introduced in appendix A.

sort(L) = sort'(map(λx · {x}, L))    (13.42)

We reuse the functions sort' and mergePairs, which are defined in the section on nature merge sort. They repeatedly merge pairs of sub lists until there is only one. Implementing this version in Haskell gives the following example code.

sort = sort' . map (\x -> [x])
This version is based on what Okasaki presented in [11]. It is quite similar to the nature merge sort, and only differs in the way of grouping. Actually, it can be deduced as a special case (the worst case) of nature merge sort by the following equation.

sort(L) = sort'(groupBy(λx, y · False, L))    (13.43)

That is, instead of spanning the non-decreasing sub list as far as possible, the predicate always evaluates to false, so every sub list spans only one element.

Similar to nature merge sort, bottom-up merge sort can also be defined by folding. The detailed implementation is left as an exercise to the reader.

Observing the bottom-up sort, we can find it's in tail-recursive call manner; thus it's quite easy to translate into a purely iterative algorithm without any recursion.

1: function Sort(A)
2:   B ← ∅
3:   for ∀a ∈ A do
4:     B ← Append(B, {a})
5:   N ← |B|
6:   while N > 1 do
7:     for i from 1 to ⌊N/2⌋ do
8:       B[i] ← Merge(B[2i − 1], B[2i])
9:     if Odd(N) then
10:      B[⌈N/2⌉] ← B[N]
11:    N ← ⌈N/2⌉
12:  if B = ∅ then
13:    return ∅
14:  return B[1]

The following example Python program implements the purely iterative bottom-up merge sort.

def mergesort(xs):
    ys = [[x] for x in xs]
    while len(ys) > 1:
        ys.append(merge(ys.pop(0), ys.pop(0)))
    return [] if ys == [] else ys.pop()

def merge(xs, ys):
    zs = []
    while xs != [] and ys != []:
        zs.append(xs.pop(0) if xs[0] <= ys[0] else ys.pop(0))
    return zs + (xs if xs != [] else ys)

The Python implementation exploits the fact that, instead of starting the next round of merging only after all pairs have been merged, we can combine the rounds by consuming pairs of lists at the head, and appending the merged result to the tail. This greatly simplifies the logic of handling the odd-sub-list case, compared with the pseudo code above.
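For instance, sorting a reversed list of four elements proceeds round by round as:

sort [4,3,2,1]
= sort' [[4], [3], [2], [1]]
= sort' [[3,4], [1,2]]      (after one round of mergePairs)
= sort' [[1,2,3,4]]         (after the second round)
= [1,2,3,4]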
Exercise 13.7

• Implement the functional bottom-up merge sort by using folding.
• Implement the iterative bottom-up merge sort only with array indexing. Don't use any library-supported tools, such as list, vector, etc.

13.12 Parallelism

We mentioned, in the basic version of quick sort, that the two sub sequences can be sorted in parallel after the divide phase finishes. This strategy is also applicable to merge sort. Actually, the parallel versions of quick sort and merge sort do not only distribute the recursive sub-sequence sorting into two parallel processes; they divide the sequence into p sub sequences, where p is the number of processors. Ideally, if we can achieve sorting in T' time with parallelism, where O(n lg n) = pT', we say it is a linear speed up, and the algorithm is parallel optimal.

However, a straightforward parallel extension to the sequential quick sort algorithm, which samples several pivots, divides p sub sequences, and independently sorts them in parallel, isn't optimal. The bottleneck exists in the divide phase, for which we can only achieve O(n) time on average. The straightforward parallel extension to merge sort, on the other hand, blocks at the merge phase. In practice, both parallel merge sort and parallel quick sort need good designs in order to achieve the optimal speed up. Actually, the divide and conquer nature makes merge sort and quick sort relatively easy to parallelize. Richard Cole found the O(lg n) parallel merge sort algorithm with n processors in 1986 [13]. Parallelism is a big and complex topic, which is out of the scope of this elementary book; readers can refer to [13] and [14] for details.

13.13 Short summary

In this chapter, two popular divide and conquer sorting methods, quick sort and merge sort, were introduced. Both of them meet the upper performance limit of comparison-based sorting algorithms, O(n lg n). Sedgewick said that quick sort is the greatest algorithm invented in the 20th century. Almost all programming environments adopt quick sort as the default sorting tool. As time goes on, some environments, especially those manipulating abstract sequences which are dynamic and not based on pure arrays, switch to merge sort as the general-purpose sorting tool⁶.

⁶Actually, most of them are a kind of hybrid sort, balanced with insertion sort to achieve good performance when the sequence is short.

The reason for this interesting phenomenon can be partly explained by the treatment in this chapter: quick sort performs perfectly in most cases, and it needs less swapping than most other algorithms. However, the quick sort algorithm is based on swapping, and in purely functional settings swapping isn't efficient, because the underlying data structure is a singly linked-list, not a vectorized array. Merge sort, on the other hand, is friendly in such an environment: it costs constant extra space, and its performance can be ensured even in cases that are worst for quick sort, where the latter downgrades to quadratic time. However, merge sort doesn't perform as well as quick sort in purely imperative settings with arrays.
It either needs extra space for merging, which is sometimes unreasonable (for example, in embedded systems with limited memory), or causes many overhead swaps in the in-place workaround. In-place merging is still an active research area.

Although the title of this chapter is `quick sort vs. merge sort', it's not the case that one algorithm has nothing to do with the other. Quick sort can be viewed as an optimized version of tree sort, as explained in this chapter. Similarly, merge sort can also be deduced from tree sort, as shown in [12].

There are many ways to categorize sorting algorithms, such as in [1]. One way is from the point of view of easy/hard partition and easy/hard merge [7]. Quick sort, for example, is quite easy for merging, because all the elements in the sub sequence before the pivot are no greater than any element after the pivot; the merging for quick sort is actually trivial sequence concatenation. Merge sort, on the other hand, is more complex in merging than quick sort. However, it's quite easy to divide, no matter what concrete divide method is taken: simple division at the middle point, even-odd splitting, nature splitting, or bottom-up straight splitting. Compared to merge sort, it's more difficult for quick sort to achieve a perfect division. We showed that, in theory, the worst case can't be completely avoided, no matter what engineering practice is taken: median-of-three, random quick sort, 3-way partition, etc.

We've shown some elementary sorting algorithms in this book up to this chapter, including insertion sort, tree sort, selection sort, heap sort, quick sort and merge sort. Sorting is still a hot research area in computer science. At the time this chapter was written, people were challenged by the buzzword `big data': the traditional convenient methods can't handle more and more huge data within reasonable time and resources. Sorting a sequence of hundreds of gigabytes has become routine in some fields.

Exercise 13.8

Design an algorithm to create a binary search tree by using the merge sort strategy.
Bibliography

[1] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional; 2nd edition (May 4, 1998). ISBN-10: 0201896850, ISBN-13: 978-0201896855

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937

[3] Robert Sedgewick. Implementing quicksort programs. Communications of the ACM, Volume 21, Number 10, 1978. pp. 847-857.

[4] Jon Bentley. Programming Pearls, Second Edition. Addison-Wesley Professional, 1999. ISBN-13: 978-0201657883

[5] Jon Bentley, Douglas McIlroy. Engineering a sort function. Software Practice and Experience, VOL. 23(11), 1249-1265, 1993.

[6] Robert Sedgewick, Jon Bentley. Quicksort is optimal. http://www.cs.princeton.edu/~rs/talks/QuicksortIsOptimal.pdf

[7] Richard Bird. Pearls of Functional Algorithm Design. Cambridge University Press, 2010. ISBN: 1139490605, 9781139490603

[8] Fethi Rabhi, Guy Lapalme. Algorithms: a functional programming approach. Second edition. Addison-Wesley, 1999. ISBN: 0201-59604-0

[9] Simon Peyton Jones. The Implementation of Functional Programming Languages. Prentice-Hall International, 1987. ISBN: 0-13-453333-X

[10] Jyrki Katajainen, Tomi Pasanen, Jukka Teuhola. Practical in-place mergesort. Nordic Journal of Computing, 1996.

[11] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press (July 1, 1999). ISBN-13: 978-0521663502

[12] Jose Bacelar Almeida and Jorge Sousa Pinto. Deriving Sorting Algorithms. Technical report, Data Structures and Algorithms, 2008.

[13] Richard Cole. Parallel merge sort. SIAM J. Comput. 17 (4): 770-785. doi:10.1137/0217049. (August 1988)

[14] David M. W. Powers. Parallelized Quicksort and Radixsort with Optimal Speedup. Proceedings of the International Conference on Parallel Computing Technologies. Novosibirsk, 1991.

[15] Wikipedia. Quicksort. http://en.wikipedia.org/wiki/Quicksort

[16] Wikipedia. Strict weak order. http://en.wikipedia.org/wiki/Strict_weak_order

[17] Wikipedia. Total order. http://en.wikipedia.org/wiki/Total_order

[18] Wikipedia. Harmonic series (mathematics). http://en.wikipedia.org/wiki/Harmonic_series_(mathematics)
Chapter 14

Searching

14.1 Introduction

Searching is quite a big and important area. Computers make many hard searching problems realistic that are almost impossible for human beings. A modern industrial robot can even search for and pick the correct gadget from the pipeline for assembly; a GPS car navigator can search the map for the best route to a specific place. The modern mobile phone is not only equipped with such a map navigator, but can also search for the best price for Internet shopping.

This chapter just scratches the surface of elementary searching. One good thing that computers offer is brute-force scanning for a certain result in a large sequence. The divide and conquer search strategy will be briefed with two problems: one is to find the k-th big one among a list of unsorted elements; the other is the popular binary search among a list of sorted elements. We'll also introduce the extension of binary search to multi-dimensional data.

Text matching is also very important in our daily life. Two well-known searching algorithms, Knuth-Morris-Pratt (KMP) and Boyer-Moore, will be introduced. They set good examples for another searching strategy: information reusing.

Besides sequence search, some elementary methods for searching for solutions to some interesting problems will be introduced. They were mostly well studied in the early phase of AI (artificial intelligence), including the basic DFS (Depth-first search) and BFS (Breadth-first search).

Finally, dynamic programming will be briefed for searching for optimal solutions, and we'll also introduce the greedy algorithm, which is applicable in some special cases.

All algorithms will be realized in both imperative and functional approaches.

14.2 Sequence search

Although modern computers offer fast speed for brute-force searching, and even if Moore's law could be strictly followed, the growth of huge data is too fast to be handled well in this way. We've seen a vivid example in the introduction chapter of this book. This is why people study computer search algorithms.
14.2.1 Divide and conquer search

One solution is to use the divide and conquer approach: if we can repeatedly scale down the search domain, the data being dropped needn't be examined at all. This will definitely speed up the search.

k-selection problem

Consider the problem of finding the k-th smallest one among n elements. The most straightforward idea is to find the minimum first, then drop it and find the second minimum element among the rest. Repeating this minimum finding and dropping for k steps gives the k-th smallest one. Finding the minimum among n elements costs linear O(n) time, so this method performs in O(kn) time, if k is much smaller than n.

Another method is to use the `heap' data structure we've introduced. No matter what concrete heap is used, e.g. binary heap with implicit array, Fibonacci heap or others, accessing the top element followed by popping is typically bound to O(lg n) time. Thus this method, as formalized in equations (14.1) and (14.2), performs in O(k lg n) time, if k is much smaller than n.

top(k, L) = find(k, heapify(L))    (14.1)

find(k, H) = \begin{cases}
top(H) & : k = 0 \\
find(k − 1, pop(H)) & : \text{otherwise}
\end{cases}    (14.2)

However, the heap adds some complexity to the solution. Is there any simple, fast method to find the k-th element?

The divide and conquer strategy can help us. If we can divide all the elements into two sub lists A and B, and ensure that all the elements in A are not greater than any element in B, we can scale down the problem with the following method¹:

1. Compare the length of sub list A with k;
2. If k ≤ |A|, the k-th smallest one must be contained in A; we can drop B and further search in A;
3. If |A| < k, the k-th smallest one must be contained in B; we can drop A and further search the (k − |A|)-th smallest one in B.

¹This actually demands a more accurate definition of the k-th smallest in L: it's equal to the k-th element of L', where L' is a permutation of L in monotonic non-decreasing order.

Note that the italic font emphasizes the fact of recursion. The ideal case always divides the list into two equally big sub lists A and B, so that we can halve the problem each time. Such an ideal case leads to a performance of O(n) linear time. Thus the key problem is how to realize the division which collects the first m smallest elements in one sub list and puts the rest in another.

This reminds us of the partition algorithm in quick sort, which moves all the elements smaller than the pivot in front of it, and moves those greater than the pivot behind it. Based on this idea, we can develop a divide and conquer k-selection algorithm, which is called the quick selection algorithm.
1. Randomly select an element (the first, for instance) as the pivot;
2. Move all elements which aren't greater than the pivot into a sub list A, and move the rest into sub list B;
3. Compare the length of A with k; if |A| = k − 1, then the pivot is the k-th smallest one;
4. If |A| > k − 1, recursively find the k-th smallest one among A;
5. Otherwise, recursively find the (k − |A| − 1)-th smallest one among B.

This algorithm can be formalized in the equation below. Suppose 0 < k ≤ |L|, where L is a non-empty list of elements. Denote by l₁ the first element in L; it is chosen as the pivot. L' contains the rest of the elements except for l₁. (A, B) = partition(λx · x ≤ l₁, L') partitions L' by using the same algorithm defined in the chapter about quick sort.
top(k, L) = \begin{cases}
l_1 & : |A| = k − 1 \\
top(k − 1 − |A|, B) & : |A| < k − 1 \\
top(k, A) & : \text{otherwise}
\end{cases}    (14.3)

partition(p, L) = \begin{cases}
(\emptyset, \emptyset) & : L = \emptyset \\
(\{l_1\} \cup A, B) & : p(l_1), (A, B) = partition(p, L') \\
(A, \{l_1\} \cup B) & : \neg p(l_1)
\end{cases}    (14.4)

The following Haskell example program implements this algorithm.

top n (x:xs) | len == n - 1 = x
             | len < n - 1 = top (n - len - 1) bs
             | otherwise = top n as
  where
    (as, bs) = partition (<= x) xs
    len = length as

The partition function is provided in the Haskell standard library; for the detailed implementation, refer to the previous chapter about quick sort.

The lucky case is that the k-th smallest element is selected as the pivot at the very beginning. The partition function examines the whole list and finds that there are k − 1 elements not greater than the pivot; we are done in just O(n) time. The worst case is that either the maximum or the minimum element is selected as the pivot every time. The partition then always produces an empty sub list: either A or B is empty. If we always pick the minimum as the pivot, the performance is bound to O(kn); if we always pick the maximum as the pivot, the performance is O((n − k)n). If k is much less than n, this downgrades to quadratic O(n²) time.

The best case (not the lucky case) is that the pivot always partitions the list perfectly. The length of A is nearly the same as the length of B, and the list is halved every time. It needs about O(lg n) partitions, and each partition takes time linear in the length of the halved list. This can be expressed as O(n + n/2 + n/4 + ⋯ + n/2^m), where m is the smallest number satisfying n/2^m ≤ k. Since this geometric series is less than 2n, summing it leads to the result of O(n).
The average case analysis needs the tool of mathematical expectation. It's quite similar to the proof given in the previous chapter about quick sort, and is left as an exercise to the reader.

Similar to quick sort, this divide and conquer selection algorithm performs well most of the time in practice. We can take the same engineering practices, such as median-of-three or randomly selecting the pivot, as we did for quick sort. Below is the imperative realization for example.

1: function Top(k, A, l, u)
2:   Exchange A[l] ↔ A[Random(l, u)]    ▷ Randomly select a pivot in [l, u]
3:   p ← Partition(A, l, u)
4:   if p − l + 1 = k then
5:     return A[p]
6:   if k < p − l + 1 then
7:     return Top(k, A, l, p − 1)
8:   return Top(k − p + l − 1, A, p + 1, u)

This algorithm searches for the k-th smallest element in the range [l, u] of array A; the boundaries are included. It first randomly selects a position and swaps it with the first one. Then this element is chosen as the pivot for partitioning. The partition algorithm moves elements in-place and returns the position to which the pivot is moved. If the pivot is just located at position k, then we are done; if there are more than k − 1 elements not greater than the pivot, the algorithm recursively searches for the k-th smallest one in the range [l, p − 1]; otherwise, k is reduced by the number of elements before the pivot, and we recursively search the range after the pivot, [p + 1, u].

There are many methods to realize the partition algorithm; the one below is based on N. Lomuto's method. Other realizations are left as exercises to the reader.

1: function Partition(A, l, u)
2:   p ← A[l]
3:   L ← l
4:   for R ← l + 1 to u do
5:     if ¬(p < A[R]) then
6:       L ← L + 1
7:       Exchange A[L] ↔ A[R]
8:   Exchange A[l] ↔ A[L]
9:   return L

The ANSI C example program below implements this algorithm. Note that it handles the special case that either the array is empty or k is out of the boundaries of the array; it returns -1 to indicate search failure.

int partition(Key* xs, int l, int u) {
    int r, p = l;
    for (r = l + 1; r < u; ++r)
        if (!(xs[p] < xs[r]))
            swap(xs, ++l, r);
    swap(xs, p, l);
    return l;
}

/* The result is stored in xs[k]; returns k if u - l >= k, otherwise -1 */
int top(int k, Key* xs, int l, int u) {
    int p;
    if (l < u) {
        swap(xs, l, rand() % (u - l) + l);
        p = partition(xs, l, u);
        if (p - l + 1 == k)
            return p;
        return (k < p - l + 1) ? top(k, xs, l, p)
                               : top(k - p + l - 1, xs, p + 1, u);
    }
    return -1;
}
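As a quick illustration (a hypothetical driver, reusing the Key and swap scaffold sketched in the merge sort chapter), selecting the 3rd smallest element of an unsorted array:

#include <stdio.h>
#include <stdlib.h>

/* with typedef int Key, swap, partition and top as defined above */
int main(void) {
    Key a[] = {9, 1, 8, 2, 7, 3};
    int k = top(3, a, 0, 6);    /* 3rd smallest in the range [0, 6) */
    if (k != -1)
        printf("%d\n", a[k]);   /* prints 3: the three smallest are 1, 2, 3 */
    return 0;
}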
There is a method proposed by Blum, Floyd, Pratt, Rivest and Tarjan in 1973, which ensures the worst case performance is bound to O(n) [2] [3]. It divides the list into small groups, each containing no more than 5 elements. The median of each group of 5 is identified quickly; then there are n/5 selected median elements. We repeat this step, dividing them again into groups of 5, and recursively selecting the median of the medians. Obviously, after O(lg n) rounds of this grouping, the final `median of medians' is reached. This is a good pivot for partitioning the list: next, we halve the list by this pivot and recursively search for the k-th smallest one. The performance can be calculated as follows.

T(n) = c₁ lg n + c₂ n + T(n/2)    (14.5)

Where c₁ and c₂ are constant factors for the median-of-medians and partition computations respectively. Solving this equation with the telescope method, or with the master theorem in [2], gives linear O(n) performance. The detailed realization of this algorithm is left as an exercise to the reader.

In case we just want to pick the top k smallest elements, but don't care about their order, the algorithm can be adjusted a little bit to fit.
tops(k, L) = \begin{cases}
\emptyset & : k = 0 \lor L = \emptyset \\
A & : |A| = k \\
A \cup \{l_1\} \cup tops(k − |A| − 1, B) & : |A| < k \\
tops(k, A) & : \text{otherwise}
\end{cases}    (14.6)

Where A and B have the same meaning as before: (A, B) = partition(λx · x ≤ l₁, L') if L isn't empty. The corresponding example program in Haskell is given below.

tops _ [] = []
tops 0 _ = []
tops n (x:xs) | len == n = as
              | len < n = as ++ [x] ++ tops (n - len - 1) bs
              | otherwise = tops n as
  where
    (as, bs) = partition (<= x) xs
    len = length as
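As a small worked evaluation of this definition (each step applies one of the four cases):

tops(3, {9, 1, 8, 2, 7, 3})
= tops(3, {1, 8, 2, 7, 3})          (all of the rest are ≤ 9, so |A| > 3)
= {1} ∪ tops(2, {8, 2, 7, 3})       (nothing is ≤ 1, so |A| = 0 < 3)
= {1} ∪ tops(2, {2, 7, 3})          (|A| = 3 > 2 again)
= {1} ∪ {2} ∪ tops(1, {7, 3})       (nothing is ≤ 2)
= {1, 2} ∪ {3}                      (A = {3}, |A| = 1 = k)
= {1, 2, 3}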
  • 2049. 434 CHAPTER 14. SEARCHING binary search Another popular divide and conquer algorithm is binary search. We've shown it in the chapter about insertion sort. When I was in school, the teacher who taught math played a magic to me, He asked me to consider a natural number less than 1000. Then he asked me some questions, I only replied `yes' or `no', and
  • 2050. nally he guessed my number. He typically asked questions like the following: Is it an even number? Is it a prime number? Are all digits same? Can it be divided by 3? ... Most of the time he guessed the number within 10 questions. My classmates and I all thought it's unbelievable. This game will not be so interesting if it downgrades to a popular TV pro-gram, that the price of a product is hidden, and you must
  • 2051. gure out the exact price in 30 seconds. The host of the program tells you if your guess is higher or lower to the fact. If you win, the product is yours. The best strategy is to use similar divide and conquer approach to perform a binary search. So it's common to
  • 2052. nd such conversation between the player and the host: P: 1000; H: High; P: 500; H: Low; P: 750; H: Low; P: 890; H: Low; P: 990; H: Bingo. My math teacher told us that, because the number we considered is within 1000, if he can halve the numbers every time by designing good questions, the number will be found in 10 questions. This is because 210 = 1024 1000. However, it would be boring to just ask it is higher than 500, is lower than 250, ... Actually, the question `is it even' is very good, because it always halve the numbers. Come back to the binary search algorithm. It is only applicable to a sequence of ordered number. I've seen programmers tried to apply it to unsorted array, and took several hours to
The idea is quite straightforward: in order to find a number x in an ordered sequence A, we first check the middle point number and compare it with x. If they are the same, we are done; if x is smaller, as A is ordered, we need only recursively search among the first half; otherwise we search among the second half. Once A gets empty and we haven't found x yet, it means x doesn't exist.

Before formalizing this algorithm, there is a surprising fact that needs to be noted. Donald Knuth stated that `Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky'. Jon Bentley pointed out that most binary search implementations contain errors, and even the one given by him in the first version of `Programming pearls' contained an error that went undetected for over twenty years [4].

There are two kinds of realization: one is recursive, the other is iterative. The recursive solution is the same as what we described. Suppose the lower and upper boundaries of the array are l and u inclusive.

1: function Binary-Search(x, A, l, u)
2:   if u < l then
3:     Not found error
4:   else
5:     m ← l + ⌊(u − l)/2⌋    ▷ avoid overflow of ⌊(l + u)/2⌋
6:     if A[m] = x then
7:       return m
8:     if x < A[m] then
9:       return Binary-Search(x, A, l, m − 1)
10:    else
11:      return Binary-Search(x, A, m + 1, u)

As the comment highlights, if integers are represented with a limited number of machine words, we can't merely use ⌊(l + u)/2⌋, because it may overflow when l and u are big.

Binary search can also be realized in an iterative manner, where we keep updating the boundaries according to the middle point comparison result.

1: function Binary-Search(x, A, l, u)
2:   while l ≤ u do
3:     m ← l + ⌊(u − l)/2⌋
4:     if A[m] = x then
5:       return m
6:     if x < A[m] then
7:       u ← m − 1
8:     else
9:       l ← m + 1
10:  return NIL

The implementation is a very good exercise, and we leave it to the reader. Please try all kinds of methods to verify your program. Since the array is halved in every step, the performance of binary search is bound to O(lg n) time.
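Given the warnings from Knuth and Bentley above, it is worth testing any binary search against a trusted reference. The following small Python sketch (ours, not from the text) pairs the iterative algorithm with a randomized verification against Python's membership test.

import random

def binary_search(x, a):
    # return an index i with a[i] == x, or None; a must be sorted
    l, u = 0, len(a) - 1
    while l <= u:
        m = l + (u - l) // 2  # overflow-safe form, a habit from C
        if a[m] == x:
            return m
        elif x < a[m]:
            u = m - 1
        else:
            l = m + 1
    return None

def verify(times=1000):
    for _ in range(times):
        a = sorted(random.sample(range(1000), random.randint(1, 50)))
        x = random.randint(0, 999)
        i = binary_search(x, a)
        assert (a[i] == x) if i is not None else (x not in a)

verify()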
In purely functional settings, the list is represented as a singly linked-list. It takes linear time to randomly access an element at a given position, so binary search doesn't make sense in such a case. However, it is instructive to analyze what the performance downgrades to. Consider the following equation.

bsearch(x, L) =
    Err              : L = ∅
    b_1              : B ≠ ∅ ∧ x = b_1
    bsearch(x, A)    : B = ∅ ∨ x < b_1
    bsearch(x, B')   : otherwise

Where (A, B) = splitAt(⌊|L|/2⌋, L); b_1 is the first element if B isn't empty, and B' holds the rest except for b_1. The splitAt function takes O(n) time to divide the list into the two sub-lists A and B (see appendix A, and the chapter about merge sort, for detail). If B isn't empty and x is equal to b_1, the search returns; otherwise if x is less than b_1, as the list is sorted, we recursively search in A; otherwise, we search in B'. If the list is empty, we raise an error to indicate search failure.

As we always split the list at the middle point, the number of elements halves in each recursion, and every recursive call takes linear time for splitting. The splitting function only traverses the first half of the linked-list, thus the total time can be expressed as

T(n) = c n/2 + c n/4 + c n/8 + ...

This results in O(n) time, which is the same as the brute force search from head to tail:

search(x, L) =
    Err             : L = ∅
    l_1             : x = l_1
    search(x, L')   : otherwise

As we mentioned in the chapter about insertion sort, the functional approach to binary search is through the binary search tree: the ordered sequence is represented as a tree (a self balanced tree if necessary), which offers logarithmic time searching.²

Although it doesn't make sense to apply divide and conquer binary search to a linked-list, binary search can still be very useful in purely functional settings. Consider solving the equation a^x = y for given natural numbers a and y, where a ≤ y. We want to find an integer solution for x if there is one. Of course brute-force naive searching can solve it: we examine all numbers one by one from 0, computing a^0, a^1, a^2, ..., stopping if a^i = y, or reporting that there is no solution if a^i < y < a^{i+1} for some i. We initialize the solution domain as X = {0, 1, 2, ...}, and call the exhaustive searching function below as solve(a, y, X).

solve(a, y, X) =
    x_1                : a^{x_1} = y
    solve(a, y, X')    : a^{x_1} < y
    Err                : otherwise

This function examines the solution domain in monotonically increasing order. It takes the first candidate element x_1 from X and compares a^{x_1} with y. If they are equal, then x_1 is the solution and we are done; if it is less than y, then x_1 is dropped, and we search among the rest of the elements, represented as X'; otherwise, since f(x) = a^x is a non-decreasing function for natural a, the rest of the elements only make f(x) bigger and bigger, so there is no integer solution for this equation, and the function returns an error.

²Some readers may argue that an array should be used instead of a linked-list, for example in Haskell. This book only deals with purely functional sequences such as the finger-tree. Different from the Haskell array, it can't support constant time random access.

The computation of a^x is expensive for big a and x if precision must be kept.³ Can it be improved so that we compute it as few times as possible? The divide and conquer binary search can help. Actually, we can estimate an upper limit of the solution domain: since a^y ≥ y, we can search in the range {0, 1, ..., y}. As the function f(x) = a^x is non-decreasing in its argument x, we first check the middle point candidate x_m = ⌊(0 + y)/2⌋. If a^{x_m} = y, the solution is found; if it is less than y, we can drop all candidate solutions before x_m; otherwise we drop all candidate solutions after it. Either way the solution domain is halved. We repeat this approach until either the solution is found or the solution domain becomes empty, which indicates there is no integer solution.

This binary search method can be formalized as the following equation. The non-decreasing function is abstracted as a parameter. To solve our problem, we just call it as bsearch(f, y, 0, y), where f(x) = a^x.

bsearch(f, y, l, u) =
    Err                       : u < l
    m                         : f(m) = y, m = ⌊(l + u)/2⌋
    bsearch(f, y, l, m − 1)   : f(m) > y
    bsearch(f, y, m + 1, u)   : f(m) < y                       (14.7)

As we halve the solution domain in every recursion, this method computes f(x) only O(lg y) times. It is much faster than the brute-force searching.

³One alternative is to reuse the result of a^n when computing a^{n+1} = a · a^n. Here we consider the general form of a monotonic function f(n).
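A minimal Python sketch of this idea (the names bsearch_eq and solve_pow are ours, not from the text): binary search over the exponent, evaluating the monotone function only O(lg y) times.

def bsearch_eq(f, y, l, u):
    # find m in [l, u] with f(m) == y for non-decreasing f, else None
    while l <= u:
        m = (l + u) // 2
        v = f(m)
        if v == y:
            return m
        elif v < y:
            l = m + 1
        else:
            u = m - 1
    return None

def solve_pow(a, y):
    # integer x with a ** x == y, or None; assumes naturals with a <= y
    return bsearch_eq(lambda x: a ** x, y, 0, y)

print(solve_pow(3, 729))  # 6
print(solve_pow(2, 100))  # None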
2 dimensions search

It's quite natural to think that the idea of binary search can be extended to a 2-dimensional, or even a more general multiple-dimensional, domain. However, it is not so easy.

Consider the example of an m × n matrix M. The elements in each row and each column are in strict increasing order. Figure 14.1 illustrates such a matrix.

    1  2  3  4  ...
    2  4  5  6  ...
    3  5  7  8  ...
    4  6  8  9  ...
    ...

Figure 14.1: A matrix in strict increasing order for each row and column.

Given a value x, how do we quickly locate all elements equal to x in the matrix? We need to develop an algorithm which returns a list of locations (i, j) such that M_{i,j} = x.

Richard Bird in [1] mentioned that he used this problem to interview candidates for entry to Oxford. The interesting story is that those who had some computer background at school tended to use binary search, but it's easy to get stuck. The usual way following the binary search idea is to examine the element at M_{m/2, n/2}. If it is less than x, we can only drop the elements in the top-left area; if it is greater than x, only the bottom-right area can be dropped. Both cases are illustrated in figure 14.2, where the gray areas indicate elements that can be dropped.

Figure 14.2: Left: the middle point element is smaller than x, and all elements in the gray area are less than x. Right: the middle point element is greater than x, and all elements in the gray area are greater than x.

The problem is that the solution domain changes from a rectangle to an 'L' shape in both cases, and we can't just recursively apply the search to it. In order to solve this problem systematically, we define the problem more generally, using brute-force search as a starting point, and keep improving it bit by bit.

Consider a function f(x, y) which is strictly increasing in both of its arguments, for instance f(x, y) = a^x + b^y, where a and b are natural numbers. Given a value z, which is a natural number too, we want to solve the equation f(x, y) = z by finding all candidate pairs (x, y).

With this definition, the matrix search problem can be specialized by the function below.

f(x, y) =
    M_{x,y}   : 1 ≤ x ≤ m, 1 ≤ y ≤ n
    −1        : otherwise

Brute-force 2D search

As all solutions of f(x, y) = z should be found, one can immediately give a brute force solution with nested loops.

1: function Solve(f, z)
2:   A ← ∅
3:   for x ∈ {0, 1, 2, ..., z} do
4:     for y ∈ {0, 1, 2, ..., z} do
5:       if f(x, y) = z then
6:         A ← A ∪ {(x, y)}
7:   return A
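A direct Python transcription of this brute force (a sketch; the name solve_bf is ours):

def solve_bf(f, z):
    # examine all (z + 1)^2 candidate pairs
    return [(x, y) for x in range(z + 1) for y in range(z + 1)
            if f(x, y) == z]

print(solve_bf(lambda x, y: 2 ** x + 3 ** y, 11))  # [(1, 2), (3, 1)]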
This definitely calculates f for (z + 1)^2 times. It can be formalized as in (14.8).

solve(f, z) = {(x, y) | x ∈ {0, 1, ..., z}, y ∈ {0, 1, ..., z}, f(x, y) = z}      (14.8)

Saddleback search

We haven't utilized the fact that f(x, y) is strictly increasing yet. Dijkstra pointed out in [6] that, instead of searching from the bottom-left corner, starting from the top-left leads to an effective solution. As illustrated in figure 14.3, the search starts from (0, z). For every point (p, q), we compare f(p, q) with z:

- If f(p, q) < z, since f is strictly increasing, for all 0 ≤ y ≤ q we have f(p, y) < z. We can drop all points in the vertical line section (in red color);
- If f(p, q) > z, then f(x, q) > z for all p ≤ x ≤ z. We can drop all points in the horizontal line section (in blue color);
- Otherwise, f(p, q) = z: we mark (p, q) as one solution, and then both line sections can be dropped.

This is a systematic way to scale down the solution domain rectangle: we keep dropping a row, or a column, or both.

Figure 14.3: Search from top-left.

This method can be formalized as a function search(f, z, p, q), which searches for solutions of f(x, y) = z in the rectangle with top-left corner (p, q) and bottom-right corner (z, 0). We start the search by initializing (p, q) = (0, z), so that solve(f, z) = search(f, z, 0, z), where

search(f, z, p, q) =
    ∅                                       : p > z ∨ q < 0
    search(f, z, p + 1, q)                  : f(p, q) < z
    search(f, z, p, q − 1)                  : f(p, q) > z
    {(p, q)} ∪ search(f, z, p + 1, q − 1)   : otherwise            (14.9)

The first clause is the edge case: there is no solution if (p, q) isn't top-left of (z, 0). The following example Haskell program implements this algorithm.

solve f z = search 0 z
    where
      search p q | p > z || q < 0 = []
                 | z' < z = search (p + 1) q
                 | z' > z = search p (q - 1)
                 | otherwise = (p, q) : search (p + 1) (q - 1)
          where z' = f p q

Considering that the calculation of f may be expensive, this program stores the result of f(p, q) in the variable z'.

This algorithm can also be implemented in an iterative manner, where the boundaries of the solution domain keep being updated in a loop.

1: function Solve(f, z)
2:   p ← 0, q ← z
3:   S ← ∅
4:   while p ≤ z ∧ q ≥ 0 do
5:     z' ← f(p, q)
6:     if z' < z then
7:       p ← p + 1
8:     else if z' > z then
9:       q ← q − 1
10:    else
11:      S ← S ∪ {(p, q)}
12:      p ← p + 1, q ← q − 1
13:  return S

It's intuitive to translate this imperative algorithm to real code, as in the following example Python program.

def solve(f, z):
    (p, q) = (0, z)
    res = []
    while p <= z and q >= 0:
        z1 = f(p, q)
        if z1 < z:
            p = p + 1
        elif z1 > z:
            q = q - 1
        else:
            res.append((p, q))
            (p, q) = (p + 1, q - 1)
    return res
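As a quick check of this Python version (an illustrative call of ours, not from the text): with f(x, y) = x + 2y and z = 6, the single saddleback pass from the top-left corner reports every solution.

print(solve(lambda x, y: x + 2 * y, 6))
# [(0, 3), (2, 2), (4, 1), (6, 0)]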
It is clear that in every iteration, at least one of p and q advances towards the bottom-right corner by one. Thus it takes at most 2(z + 1) steps to complete the search. This is the worst case. There are three best cases. The first happens when, in every iteration, both p and q advance by one, so that only z + 1 steps are needed. The second case keeps advancing horizontally to the right and ends when p exceeds z. The last case is similar: it keeps moving down vertically until q becomes negative. Figure 14.4 illustrates the best cases and the worst case. Figure 14.4 (a) is the case where every point (x, z − x) on the diagonal satisfies f(x, z − x) = z; it uses z + 1 steps to arrive at (z, 0). (b) is the case where every point (x, z) along the top horizontal line gives f(x, z) < z; the algorithm takes z + 1 steps to finish. (c) is the case where every point (0, x) along the left vertical line gives f(0, x) > z, so the algorithm again takes z + 1 steps. (d) is the worst case: if we project all the horizontal sections along the search path onto the x axis, and all the vertical sections onto the y axis, they give the total of 2(z + 1) steps.

Figure 14.4: The best cases and the worst cases.

Compared to the quadratic brute-force method (O(z^2)), we have improved to a linear algorithm bound to O(z).

Bird imagined that the name `saddleback' comes from the 3D plot of f, with the smallest value at the bottom-left, the largest at the top-right, and two wings, which looks like a saddle, as shown in figure 14.5.

Figure 14.5: Plot of f(x, y) = x^2 + y^2.

Improved saddleback search

We haven't utilized the binary search tool so far, even though the problem has been extended to a 2-dimensional domain. The basic saddleback search starts from the top-left corner (0, z) and ends at the bottom-right corner (z, 0).
This is actually an over-general domain; we can constrain it to be more accurate. Since f is strictly increasing, we can find the biggest number m, 0 ≤ m ≤ z, along the y axis which satisfies f(0, m) ≤ z. Similarly, we can find the biggest n, 0 ≤ n ≤ z, along the x axis, which satisfies f(n, 0) ≤ z. The solution domain shrinks from (0, z) − (z, 0) to (0, m) − (n, 0), as shown in figure 14.6.

Figure 14.6: A more accurate search domain shown in gray color.

Of course m and n can be found by brute-force, as below.

m = max({y | 0 ≤ y ≤ z, f(0, y) ≤ z})
n = max({x | 0 ≤ x ≤ z, f(x, 0) ≤ z})                          (14.10)

When searching for m, the x variable of f is bound to 0. It turns into a one-dimensional search problem for a strictly increasing function (in functional terms, the curried function f(0, y)). Binary search works in such a case; however, we need to modify equation (14.7) a bit. Instead of searching for a solution l ≤ x ≤ u such that f(x) = y for a given y, we need to search for a solution l ≤ x ≤ u such that f(x) ≤ y < f(x + 1).

bsearch(f, y, l, u) =
    l                          : u ≤ l
    m                          : f(m) ≤ y < f(m + 1), m = ⌊(l + u)/2⌋
    bsearch(f, y, m + 1, u)    : f(m) ≤ y
    bsearch(f, y, l, m − 1)    : otherwise                     (14.11)

The first clause handles the edge case of an empty range; the lower boundary is returned in this case. If the middle point produces a value less than or equal to the target, while the next one evaluates to a bigger value, then the middle point is what we are looking for. Otherwise, if the point next to the middle also evaluates to a value not greater than the target, the lower bound is increased past the middle, and we recursively perform binary search. In the last case, the middle point evaluates to a value greater than the target, so the upper bound is decreased to the point before the middle for further recursive searching. The following Haskell example code implements this modified binary search.

bsearch f y (l, u) | u <= l = l
                   | f m <= y = if f (m + 1) <= y
                                then bsearch f y (m + 1, u) else m
                   | otherwise = bsearch f y (l, m - 1)
    where m = (l + u) `div` 2

Then m and n can be found with this binary search function.

m = bsearch(λy · f(0, y), z, 0, z)
n = bsearch(λx · f(x, 0), z, 0, z)                             (14.12)

And the improved saddleback search shrinks to this new search domain: solve(f, z) = search(f, z, 0, m), where

search(f, z, p, q) =
    ∅                                       : p > n ∨ q < 0
    search(f, z, p + 1, q)                  : f(p, q) < z
    search(f, z, p, q − 1)                  : f(p, q) > z
    {(p, q)} ∪ search(f, z, p + 1, q − 1)   : otherwise        (14.13)

It's almost the same as the basic saddleback version, except that it stops when p exceeds n rather than z. In a real implementation, the result of f(p, q) can be calculated once and stored in a variable, as shown in the following Haskell example.

solve' f z = search 0 m
    where
      search p q | p > n || q < 0 = []
                 | z' < z = search (p + 1) q
                 | z' > z = search p (q - 1)
                 | otherwise = (p, q) : search (p + 1) (q - 1)
          where z' = f p q
      m = bsearch (f 0) z (0, z)
      n = bsearch (\x -> f x 0) z (0, z)

This improved saddleback search first performs two rounds of binary search to find the proper m and n. Each round is bound to O(lg z) calculations of f. After that, it takes O(m + n) time in the worst case, and O(min(m, n)) time in the best case. The overall performance is given in the following table.

            times of evaluating f
worst case  2 lg z + m + n
best case   2 lg z + min(m, n)

For functions such as f(x, y) = a^x + b^y with positive integers a and b, m and n will be relatively small, so the performance is close to O(lg z).

This algorithm can also be realized in an imperative approach. Firstly, the binary search should be modified.

1: function Binary-Search(f, y, (l, u))
2:   while l < u do
3:     m ← ⌊(l + u)/2⌋
4:     if f(m) ≤ y then
5:       if y < f(m + 1) then
6:         return m
7:       l ← m + 1
8:     else
9:       u ← m
10:  return l

Utilizing this algorithm, the boundaries m and n can be found before performing the saddleback search.

1: function Solve(f, z)
2:   m ← Binary-Search(λy · f(0, y), z, (0, z))
3:   n ← Binary-Search(λx · f(x, 0), z, (0, z))
4:   p ← 0, q ← m
5:   S ← ∅
6:   while p ≤ n ∧ q ≥ 0 do
7:     z' ← f(p, q)
8:     if z' < z then
9:       p ← p + 1
10:    else if z' > z then
11:      q ← q − 1
12:    else
13:      S ← S ∪ {(p, q)}
14:      p ← p + 1, q ← q − 1
15:  return S

The implementation is left as an exercise to the reader.
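Though the text leaves it as an exercise, one possible Python sketch follows (our own code, under the same assumptions as above): bsearch_bound finds the last point where f stays ≤ the target, and the saddleback scan then runs over the shrunken rectangle.

def bsearch_bound(f, y, l, u):
    # largest x in [l, u] with f(x) <= y, for non-decreasing f;
    # returns l if no such x exists
    while l < u:
        m = (l + u) // 2
        if f(m) <= y:
            if y < f(m + 1):
                return m
            l = m + 1
        else:
            u = m
    return l

def solve_improved(f, z):
    m = bsearch_bound(lambda y: f(0, y), z, 0, z)
    n = bsearch_bound(lambda x: f(x, 0), z, 0, z)
    p, q, res = 0, m, []
    while p <= n and q >= 0:
        v = f(p, q)
        if v < z:
            p = p + 1
        elif v > z:
            q = q - 1
        else:
            res.append((p, q))
            p, q = p + 1, q - 1
    return res

print(solve_improved(lambda x, y: 2 ** x + 3 ** y, 11))  # [(1, 2), (3, 1)]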
More improvement to saddleback search

In figure 14.2, two cases are shown for comparing the value of the middle point in a matrix with the given value: either the center value is smaller than the given value, or it is bigger. In both cases we can only throw away 1/4 of the candidates, leaving an 'L' shape for further searching.

Actually, one important case is missing. We can extend the observation to any point inside the rectangular search area, as shown in figure 14.7. Suppose we are searching in a rectangle from the lower-left corner (a, b) to the upper-right corner (c, d). If (p, q) isn't the middle point and f(p, q) ≠ z, we can't ensure the area to be dropped is always 1/4. However, if f(p, q) = z, since f is strictly increasing, we are not only sure that both the lower-left and the upper-right sub areas can be thrown away, but also all the other points in column p and row q. Only 1/2 of the area is left, so the problem can be scaled down fast.

Figure 14.7: The efficiency of scaling down the search domain. (a) If f(p, q) ≠ z, only the lower-left or the upper-right sub area (in gray color) can be thrown away; both leave an 'L' shape. (b) If f(p, q) = z, both sub areas can be thrown away, and the scale of the problem is halved.

This suggests that, instead of jumping to the middle point to start searching, a more efficient way is to find a point which evaluates to the target value. One straightforward way to find such a point is to perform binary search along the center horizontal line or the center vertical line of the rectangle. The performance of binary search along a line is logarithmic in the length of that line. A good idea is to always pick the shorter center line, as shown in figure 14.8: if the height of the rectangle is greater than its width, we perform binary search along the horizontal center line; otherwise we choose the vertical center line.

Figure 14.8: Binary search along the shorter center line.

However, what if we can't find a point (p, q) on the center line which satisfies f(p, q) = z? Let's take the center horizontal line for example. Even in such a case, we can still find a point such that f(p, q) < z < f(p + 1, q). The only difference is that we can't drop the points in column p and row q completely. Combining these conditions, the binary search along the horizontal line is to find a p satisfying f(p, q) ≤ z < f(p + 1, q), while the vertical line search condition is f(p, q) ≤ z < f(p, q + 1).
The modified binary search (14.11) ensures that, if all points in the line segment give f(p, q) < z, the upper bound will be found, and the lower bound will be found if they are all greater than z. We can drop the whole area on one side of the center line in such cases.

Summing up all the ideas, we can develop the efficient improved saddleback search as follows:

1. Perform binary search along the y axis and the x axis to find the tight boundaries from (0, m) to (n, 0);
2. Denote the candidate rectangle as (a, b) − (c, d); if the candidate rectangle is empty, the solution is empty;
3. If the height of the rectangle is greater than its width, perform binary search along the center horizontal line; otherwise perform binary search along the center vertical line; denote the search result as (p, q);
4. If f(p, q) = z, record (p, q) as a solution, and recursively search the two sub rectangles (a, b) − (p − 1, q + 1) and (p + 1, q − 1) − (c, d);
5. Otherwise, f(p, q) ≠ z; recursively search the same two sub rectangles plus a line section. The line section is either (p, q + 1) − (p, b), as shown in figure 14.9 (a), or (p + 1, q) − (c, q), as shown in figure 14.9 (b).

Figure 14.9: Recursively search the gray areas; the bold line section should be included if f(p, q) ≠ z.

This algorithm can be formalized as follows. Equations (14.11) and (14.12) are the same as before; a new search function should be defined. Define search_{(a,b),(c,d)} as a function for searching the rectangle with top-left corner (a, b) and bottom-right corner (c, d).

search_{(a,b),(c,d)} =
    ∅          : c < a ∨ b < d
    csearch    : c − a < b − d
    rsearch    : otherwise                                     (14.14)
Function csearch performs binary search along the center horizontal line to find a point (p, q) such that f(p, q) ≤ z < f(p + 1, q), as shown in figure 14.10 (a). There is a special edge case, where all points on the line evaluate to values greater than z. The general binary search will return the lower bound as the result, so that (p, q) = (a, ⌊(b + d)/2⌋). The whole upper side, including the center line, can be dropped, as shown in figure 14.10 (a).

Figure 14.10: Edge cases when performing binary search along the center line.

csearch =
    search_{(p,q−1),(c,d)}                                             : z < f(p, q)
    search_{(a,b),(p−1,q+1)} ∪ {(p, q)} ∪ search_{(p+1,q−1),(c,d)}     : f(p, q) = z
    search_{(a,b),(p,q+1)} ∪ search_{(p+1,q−1),(c,d)}                  : otherwise     (14.15)

Where

q = ⌊(b + d)/2⌋
p = bsearch(λx · f(x, q), z, (a, c))

Function rsearch is quite similar, except that it searches along the center vertical line.

rsearch =
    search_{(a,b),(p−1,q)}                                             : z < f(p, q)
    search_{(a,b),(p−1,q+1)} ∪ {(p, q)} ∪ search_{(p+1,q−1),(c,d)}     : f(p, q) = z
    search_{(a,b),(p−1,q+1)} ∪ search_{(p+1,q),(c,d)}                  : otherwise     (14.16)

Where

p = ⌊(a + c)/2⌋
q = bsearch(λy · f(p, y), z, (d, b))

The following Haskell program implements this algorithm.

search f z (a, b) (c, d)
    | c < a || b < d = []
    | c - a < b - d = let q = (b + d) `div` 2 in
                      csearch (bsearch (\x -> f x q) z (a, c), q)
    | otherwise = let p = (a + c) `div` 2 in
                  rsearch (p, bsearch (f p) z (d, b))
    where
      csearch (p, q) | z < f p q = search f z (p, q - 1) (c, d)
                     | f p q == z = search f z (a, b) (p - 1, q + 1) ++
                                    (p, q) : search f z (p + 1, q - 1) (c, d)
                     | otherwise = search f z (a, b) (p, q + 1) ++
                                   search f z (p + 1, q - 1) (c, d)
      rsearch (p, q) | z < f p q = search f z (a, b) (p - 1, q)
                     | f p q == z = search f z (a, b) (p - 1, q + 1) ++
                                    (p, q) : search f z (p + 1, q - 1) (c, d)
                     | otherwise = search f z (a, b) (p - 1, q + 1) ++
                                   search f z (p + 1, q) (c, d)

And the main program calls this function after performing binary search along the X and Y axes.

solve f z = search f z (0, m) (n, 0)
    where
      m = bsearch (f 0) z (0, z)
      n = bsearch (\x -> f x 0) z (0, z)

Since we drop half of the area in every recursion, it takes O(log(mn)) rounds of search. However, in order to locate the point (p, q) which halves the problem, we must perform binary search along the center line, which calls f about O(log(min(m, n))) times. Denoting the time of searching an m × n rectangle as T(m, n), the recursion relationship can be expressed as follows.

T(m, n) = log(min(m, n)) + 2 T(m/2, n/2)                       (14.17)

Suppose m ≤ n; bounding the per-level binary search cost by the logarithm of the longer side, for m = 2^i and n = 2^j we have:

T(2^i, 2^j) = j + 2 T(2^{i−1}, 2^{j−1})
            = Σ_{k=0}^{i−1} 2^k (j − k)
            = O(2^i (j − i))
            = O(m log(n/m))                                    (14.18)

Richard Bird proved that this is asymptotically optimal, by establishing a lower bound for searching a given value in an m × n rectangle [1].

The imperative algorithm is almost the same as the functional version; we skip it for the sake of brevity.
Exercise 14.1

- Prove that the average case performance of the divide and conquer solution to the k-selection problem is O(n). Please refer to the previous chapter about quick sort.
- Implement the imperative k-selection algorithm with 2-way partition and median-of-three pivot selection.
- Implement the imperative k-selection algorithm so that it handles duplicated elements effectively.
- Realize the median-of-medians k-selection algorithm and implement it in your favorite programming language.
- The tops(k, L) algorithm uses list concatenation, as in A ∪ {l_1} ∪ tops(k − |A| − 1, B). This is a linear operation, proportional to the length of the list being concatenated. Modify the algorithm so that the sub lists are concatenated in one pass.
- The author considered another divide and conquer solution for the k-selection problem, shown below. It finds the maximum of the first k elements and the minimum of the rest; denote them as x and y. If x is smaller than y, it means that all of the first k elements are smaller than the rest, so they are exactly the top k smallest. Otherwise, some elements among the first k should be swapped.

1: procedure Tops(k, A)
2:   l ← 1
3:   u ← |A|
4:   loop
5:     i ← Max-At(A[l..k])
6:     j ← Min-At(A[k + 1..u])
7:     if A[i] < A[j] then
8:       break
9:     Exchange A[l] ↔ A[j]
10:    Exchange A[k + 1] ↔ A[i]
11:    l ← Partition(A, l, k)
12:    u ← Partition(A, k + 1, u)

  Explain why this algorithm works. What's its performance?
- Implement the binary search algorithm in both recursive and iterative manners, and try to verify your version automatically. You can either generate randomized data and test your program with the binary search invariant, or compare it with the built-in binary search tool in your standard library.
- Implement the improved saddleback search, by first performing binary search to find a more accurate solution domain, in your favorite imperative programming language.
- Realize the improved 2D search, by performing binary search along the shorter center line, in your favorite imperative programming language.
- Someone considers that the 2D search can be designed as follows. When searching a rectangle, as the minimum value is at the bottom-left and the maximum at the top-right, if the target value is less than the minimum or greater than the maximum, then there is no solution; otherwise, the rectangle is divided into 4 sub rectangles at the center point, which are then searched recursively.

1: procedure Search(f, z, a, b, c, d)    ▷ (a, b): bottom-left, (c, d): top-right
2:   if z ≤ f(a, b) ∨ f(c, d) ≤ z then
3:     if z = f(a, b) then
4:       record (a, b) as a solution
5:     if z = f(c, d) then
6:       record (c, d) as a solution
7:     return
8:   p ← ⌊(a + c)/2⌋
9:   q ← ⌊(b + d)/2⌋
10:  Search(f, z, a, q, p, d)
11:  Search(f, z, p, q, c, d)
12:  Search(f, z, a, b, p, q)
13:  Search(f, z, p, b, c, q)

  What's the performance of this algorithm?
14.2.2 Information reuse

One interesting behavior is that people learn while searching: we not only remember the lessons of failed attempts, but also learn the patterns which lead to success. This is a kind of information reuse, no matter whether the information is positive or negative. However, it's not easy to determine what information should be kept. Too little information doesn't help effective searching, while keeping too much is expensive in terms of space.

In this section, we'll first introduce two interesting problems, the Boyer-Moore majority number problem and the maximum sum of sub vector problem; both reuse as little information as possible. After that, two popular string matching algorithms, the Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm, will be introduced.

Boyer-Moore majority number

Voting is quite important to people. We use voting to choose a leader, make a decision, or reject a proposal. In the months when this chapter was written, three countries in the world voted for their presidents; all three used computers to calculate the results.

Suppose a country on a small island wants a new president. According to the constitution, only a candidate who wins more than half of the votes can be selected as the president. Given a series of votes, such as A, B, A, C, B, B, D, ..., can we develop a program which tells who the new president is, if there is one, or otherwise indicates that nobody wins more than half of the votes?

Of course this problem can be solved by brute-force with a map, as we did in the chapter about the binary search tree.⁴

template<typename T>
T majority(const T* xs, int n, T fail) {
    map<T, int> m;
    int i, max = 0;
    T r;
    for (i = 0; i < n; ++i)
        ++m[xs[i]];
    for (typename map<T, int>::iterator it = m.begin(); it != m.end(); ++it)
        if (it->second > max) {
            max = it->second;
            r = it->first;
        }
    return max * 2 > n ? r : fail;
}

⁴There is a probabilistic sub-linear space counting algorithm published in 2004, named `Count-min sketch' [8].
This program first scans the votes, accumulating the number of votes for each individual in a map. After that, it traverses the map to find the one with the most votes. If that number is bigger than half, the winner is found; otherwise, a value indicating failure is returned. The following pseudo code describes this algorithm.

1: function Majority(A)
2:   M ← empty map
3:   for ∀a ∈ A do
4:     Put(M, a, 1 + Get(M, a))
5:   max ← 0, m ← NIL
6:   for ∀(k, v) ∈ M do
7:     if max < v then
8:       max ← v, m ← k
9:   if max > |A| × 50% then
10:    return m
11:  else
12:    fail

For m individuals and n votes, this program firstly takes about O(n log m) time to build the map, if the map is implemented as a self balanced tree (a red-black tree for instance), or about O(n) time if the map is hash table based. However, the hash table needs more space. Next, the program takes O(m) time to traverse the map and find the majority vote. The following table lists the time and space performance for different maps.

map                 time        space
self-balanced tree  O(n log m)  O(m)
hashing             O(n)        O(m) at least
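For reference, the same brute-force idea takes only a few lines of Python (a sketch of ours, using the standard library Counter as the map):

from collections import Counter

def majority_bf(votes, fail=None):
    # count every vote, then check whether the top one exceeds half;
    # assumes at least one vote
    winner, count = Counter(votes).most_common(1)[0]
    return winner if count * 2 > len(votes) else fail

print(majority_bf("ABCBBCABABBDB"))  # B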
Boyer and Moore invented a clever algorithm in 1980, which can pick the majority element with only one scan, if there is one. Their algorithm needs only O(1) space [7]. The idea is to record the first candidate as the winner so far, and mark him with 1 vote. During the scan process, if the currently selected winner gets another vote, we just increase the vote counter; otherwise, it means somebody votes against this candidate, so the vote counter is decreased by one. If the vote counter becomes zero, the candidate has been voted out; we select the next candidate as the new winner and repeat the scanning process.

Suppose there is a series of votes: A, B, C, B, B, C, A, B, A, B, B, D, B. The following table illustrates the steps of this processing.

vote scanned  winner  count
A             A       1
B             A       0
C             C       1
B             C       0
B             B       1
C             B       0
A             A       1
B             A       0
A             A       1
B             A       0
B             B       1
D             B       0
B             B       1

The key point is that, if there exists a majority element which wins more than half of the votes, it can't be voted out by all the others. However, if no candidate wins more than half of the votes, the recorded `winner' is invalid. Thus it is necessary to perform a second round of scanning for verification. The following pseudo code illustrates the one-pass scan.

1: function Majority(A)
2:   c ← 0
3:   for i ← 1 to |A| do
4:     if c = 0 then
5:       x ← A[i]
6:     if A[i] = x then
7:       c ← c + 1
8:     else
9:       c ← c − 1
10:  return x

If there is a majority element, this algorithm takes one pass to scan the votes. In every iteration, it either increases or decreases the counter according to whether the vote supports or opposes the current selection. If the counter becomes zero, the current selection has been voted out, so the next element is selected as the updated candidate for further scanning. The process is linear O(n) time, and the space needed is just two variables: one records the selected candidate so far, the other counts the votes.

Although this algorithm can find the majority element if there is one, it still picks an element even if there isn't. The following modified algorithm verifies the final result with another round of scanning.

1: function Majority(A)
2:   c ← 0
3:   for i ← 1 to |A| do
4:     if c = 0 then
5:       x ← A[i]
6:     if A[i] = x then
7:       c ← c + 1
8:     else
9:       c ← c − 1
10:  c ← 0
11:  for i ← 1 to |A| do
12:    if A[i] = x then
13:      c ← c + 1
14:  if c > |A| × 50% then
15:    return x
16:  else
17:    fail

Even with this verification process, the algorithm is still bound to O(n) time, and the space needed is constant. The following ISO C++ program implements this algorithm.⁵

template<typename T>
T majority(const T* xs, int n, T fail) {
    T m;
    int i, c;
    for (i = 0, c = 0; i < n; ++i) {
        if (!c)
            m = xs[i];
        c += xs[i] == m ? 1 : -1;
    }
    for (i = 0, c = 0; i < n; ++i)
        if (xs[i] == m)
            ++c;
    return c * 2 > n ? m : fail;
}

⁵We actually use the ANSI C style. The C++ template is only used to generalize the type of the elements.
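The same one-pass idea translates directly to Python; this sketch (ours, not from the text) returns None when no candidate exceeds half of the votes.

def majority_bm(votes, fail=None):
    c, x = 0, None
    for v in votes:                  # one pass: select a candidate
        if c == 0:
            x = v
        c += 1 if v == x else -1
    # second pass: verify that the candidate wins more than half
    if sum(1 for v in votes if v == x) * 2 > len(votes):
        return x
    return fail

print(majority_bm("ABCBBCABABBDB"))  # B
print(majority_bm("ABCA"))           # None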
The Boyer-Moore majority algorithm can also be realized in a purely functional approach. Different from the imperative settings, which use variables to record and update information, accumulators are used to define the core algorithm. Define a function maj(c, n, L), which takes a list of votes L, the selected candidate c so far, and a counter n. For a non-empty list L, we initialize c as the first vote l_1, and set the counter to 1 to start the algorithm: maj(l_1, 1, L'), where L' holds the rest of the votes except for l_1. Below is the definition of this function.

maj(c, n, L) =
    c                    : L = ∅
    maj(c, n + 1, L')    : l_1 = c
    maj(l_1, 1, L')      : n = 0 ∧ l_1 ≠ c
    maj(c, n − 1, L')    : otherwise                           (14.19)

We also need to define a function which verifies the result. The idea is that, if the list of votes is empty, the final result is a failure; otherwise, we run the Boyer-Moore algorithm to find a candidate c, then scan the list again to count the total votes c wins, and verify that this number is more than half.

majority(L) =
    fail   : L = ∅
    c      : c = maj(l_1, 1, L'), |{x | x ∈ L, x = c}| × 2 > |L|
    fail   : otherwise                                         (14.20)

The Haskell example code below implements this algorithm.

majority :: (Eq a) => [a] -> Maybe a
majority [] = Nothing
majority (x:xs) = let m = maj x 1 xs in verify m (x:xs)

maj c n [] = c
maj c n (x:xs) | c == x = maj c (n+1) xs
               | n == 0 = maj x 1 xs
               | otherwise = maj c (n-1) xs

verify m xs = if 2 * (length $ filter (==m) xs) > length xs
              then Just m else Nothing

Maximum sum of sub vector

Jon Bentley presents another interesting puzzle in [4], which can be solved with a quite similar idea. The problem is to find the maximum sum of a sub vector.
For example, in the following array, the sub vector {19, -12, 1, 9, 18} yields the biggest sum, 35.

3  -13  19  -12  1  9  18  -16  15  -15

Note that it is only required to output the value of the maximum sum. If all the numbers are positive, the answer is definitely the sum of all of them. Another special case is that all numbers are negative; we define the maximum sum to be 0, for an empty sub vector.

Of course we can find the answer with brute-force, by calculating the sums of all sub vectors and picking the maximum. Such a naive method is typically quadratic.

1: function Max-Sum(A)
2:   m ← 0
3:   for i ← 1 to |A| do
4:     s ← 0
5:     for j ← i to |A| do
6:       s ← s + A[j]
7:       m ← Max(m, s)
8:   return m

The brute force algorithm does not reuse any information from previous searches. Similar to the Boyer-Moore majority vote algorithm, we can record the maximum sum ending at the position currently being scanned, together with the biggest sum found so far. Figure 14.11 illustrates this idea and the invariant maintained during the scan.

Figure 14.11: Invariant during the scan: the maximum sum found so far, and the maximum sum ending at the current position i.

At any time when we scan to the i-th position, the max sum found so far is recorded as A. At the same time, we also record the biggest sum ending at i as B. Note that A and B may not be the same; in fact, we always maintain B ≤ A. When B becomes greater than A by adding the next element, we update A with this new value. When B becomes negative, which happens when the next element is a negative number, we reset it to 0. The following table illustrates the steps when we scan the example vector {3, -13, 19, -12, 1, 9, 18, -16, 15, -15}.

max sum  max ending at i  list to be scanned
0        0                {3, -13, 19, -12, 1, 9, 18, -16, 15, -15}
3        3                {-13, 19, -12, 1, 9, 18, -16, 15, -15}
3        0                {19, -12, 1, 9, 18, -16, 15, -15}
19       19               {-12, 1, 9, 18, -16, 15, -15}
19       7                {1, 9, 18, -16, 15, -15}
19       8                {9, 18, -16, 15, -15}
19       17               {18, -16, 15, -15}
35       35               {-16, 15, -15}
35       19               {15, -15}
35       34               {-15}
35       19               {}

This algorithm can be described as below.

1: function Max-Sum(V)
2:   A ← 0, B ← 0
3:   for i ← 1 to |V| do
4:     B ← Max(B + V[i], 0)
5:     A ← Max(A, B)
6:   return A

It is trivial to implement this linear time algorithm, so we skip the details here.

The algorithm can also be defined in a functional approach. Instead of mutating variables, we use accumulators to record A and B. In order to search the maximum sum of list L, we call the function below as maxsum(0, 0, L).

maxsum(A, B, L) =
    A                      : L = ∅
    maxsum(A', B', L')     : otherwise                         (14.21)

Where

B' = max(l_1 + B, 0)
A' = max(A, B')

The Haskell example code below implements this algorithm.

maxsum = msum 0 0
    where
      msum a _ [] = a
      msum a b (x:xs) = let b' = max (x + b) 0
                            a' = max a b' in msum a' b' xs
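For completeness, a direct Python transcription of the linear scan (a sketch of ours) makes the invariant concrete:

def max_sum(v):
    a = b = 0
    for x in v:
        b = max(b + x, 0)   # best sum ending at the current position
        a = max(a, b)       # best sum seen so far
    return a

print(max_sum([3, -13, 19, -12, 1, 9, 18, -16, 15, -15]))  # 35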
KMP

String matching is another important type of searching. Almost all software editors are equipped with tools to find strings in text. In the chapters about Trie, Patricia, and suffix trees, we introduced some powerful data structures which can help to search strings. In this section, we introduce two other string matching algorithms, both based on information reuse.

Some programming environments provide built-in string search tools; however, most of them are brute-force solutions, including the `strstr' function in the ANSI C standard library, `find' in the C++ standard template library, `indexOf' in the Java Development Kit, etc. Figure 14.12 illustrates how such a character-by-character comparison process works.

Figure 14.12: Match `ananym' in `any ananthous ananym flower'. (a) At offset s = 4, after matching q = 4 characters, the 5th mismatches. (b) Move to s = 4 + 2 = 6 directly.

Suppose we search for a pattern P in text T. As shown in figure 14.12 (a), at offset s = 4, the process examines every character in P and T to check whether they are the same. It successfully matches the first 4 characters `anan'. However, the 5th character in the pattern string is `y', which doesn't match the corresponding character in the text, which is `t'.

At this stage, the brute-force solution terminates the attempt, increases s by one to 5, and restarts the comparison between `ananym' and `nantho...'. Actually, we can increase s by more than one. This is because we already know that the first four characters `anan' have been matched, and the failure happened at the 5th position. Observe that the two-letter prefix `an' of the pattern string is also a suffix of `anan', the part we have matched so far. A cleverer way is to shift s by two rather than one, as shown in figure 14.12 (b). By this means, we reuse the information that 4 characters have already been matched, which helps us skip as many invalid offsets as possible.

Knuth, Morris and Pratt presented this idea in [9] and developed a novel string matching algorithm, later called `KMP' after the three authors' initials.

For the sake of brevity, we denote the first k characters of text T as T_k; that is, T_k is the k-character prefix of T.

The key point to shifting s effectively is to find a function of q, where q is the number of characters matched successfully. For instance, q is 4 in figure 14.12 (a), as the 5th character doesn't match.

Consider in what situation we can shift s by more than 1. As shown in figure 14.13, if we can shift the pattern P ahead, there must exist k such that the first k characters of the pattern are the same as the last k characters of P_q; in other words, the prefix P_k is a suffix of P_q.
Figure 14.13: P_k is both a prefix of P_q and a suffix of P_q.

It's possible that no such prefix is a suffix at the same time. If we treat the empty string as both a prefix and a suffix of any string, there is always at least one solution, k = 0. It's also quite possible that multiple values of k satisfy the condition. To avoid missing any possible matching positions, we have to find the biggest k. We can define a prefix function π(q), which tells us where we can fall back if the (q + 1)-th character does not match [2].

π(q) = max{ k | k < q ∧ P_k ⊐ P_q }                            (14.22)

Where ⊐ is read as `is a suffix of'; for instance, A ⊐ B means A is a suffix of B. This function is used as follows. When we match pattern P against text T from some offset, if the matching fails after q successful characters, we look up π(q) to get a fallback value q', and retry comparing P[q' + 1] with the previously unmatched character. Based on this idea, the core algorithm of KMP can be described as follows.

1: function KMP(T, P)
2:   n ← |T|, m ← |P|
3:   build prefix function π from P
4:   q ← 0    ▷ How many characters have been matched so far.
5:   for i ← 1 to n do
6:     while q > 0 ∧ P[q + 1] ≠ T[i] do
7:       q ← π(q)
8:     if P[q + 1] = T[i] then
9:       q ← q + 1
10:    if q = m then
11:      found one solution at i − m
12:      q ← π(q)    ▷ look for the next solution

Although the definition of the prefix function π(q) is given in equation (14.22), realizing it blindly by finding the longest suffix isn't effective. Actually, we can use the idea of information reuse again to build the prefix function.

The trivial edge case is that the first character doesn't match. In this case the longest prefix which is also a suffix is definitely empty, so π(1) = k = 0. We record the longest such prefix as P_k; in this edge case, P_k = P_0 is the empty string.

After that, when we scan the q-th character in the pattern string P, we hold the invariant that the prefix function values π(i) for i in {1, 2, ..., q − 1} have already been recorded, and that P_k is the longest prefix which is also a suffix of P_{q−1}. As shown in figure 14.14, if P[q] = P[k + 1], a bigger k than before is found, and we can increase the maximum k by one. Otherwise, if they are not the same, we can use π(k) to fall back to a shorter prefix P_{k'}, where k' = π(k), and check whether the character next to this new prefix is the same as the q-th character. We repeat this step until either k becomes zero (which means only the empty string satisfies the condition), or the q-th character matches.

Figure 14.14: P_k is a suffix of P_{q−1}; P[q] and P[k + 1] are compared.

Realizing this idea gives the KMP prefix-building algorithm.

1: function Build-Prefix-Function(P)
2:   m ← |P|, k ← 0
3:   π(1) ← 0
4:   for q ← 2 to m do
5:     while k > 0 ∧ P[q] ≠ P[k + 1] do
6:       k ← π(k)
7:     if P[q] = P[k + 1] then
8:       k ← k + 1
9:     π(q) ← k
10:  return π

The following table lists the steps of building the prefix function for the pattern string `ananym'. Note that the k in the table is the maximum k satisfying equation (14.22).

q  P_q     k  P_k
1  a       0  ``''
2  an      0  ``''
3  ana     1  a
4  anan    2  an
5  anany   0  ``''
6  ananym  0  ``''

Translating the KMP algorithm to Python gives the example code below. The fallback table is indexed by the number of matched characters, with entries for 0 through m, so that the lookup after a full match is also well defined.

def kmp_match(w, p):
    n, m = len(w), len(p)
    fallback = fprefix(p)
    k = 0  # how many characters have been matched so far
    res = []
    for i in range(n):
        while k > 0 and p[k] != w[i]:
            k = fallback[k]  # fall back
        if p[k] == w[i]:
            k = k + 1
        if k == m:
            res.append(i + 1 - m)
            k = fallback[k]  # look for the next occurrence
    return res

def fprefix(p):
    m = len(p)
    t = [0] * (m + 1)  # fallback table: t[q] is the new q after a mismatch
    k = 0
    for q in range(2, m + 1):
        while k > 0 and p[q - 1] != p[k]:
            k = t[k]  # fall back
        if p[q - 1] == p[k]:
            k = k + 1
        t[q] = k
    return t
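As a quick sanity check of this Python version (an illustrative call of ours), searching for the example pattern of figure 14.12 reports the 0-based offset of the occurrence:

print(kmp_match("any ananthous ananym flower", "ananym"))  # [14]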
The KMP algorithm builds the prefix function for the pattern string as a kind of pre-processing before the search. Because of this, it can reuse as much information from previous matching as possible.

The amortized performance of building the prefix function is O(m). This can be proved with the potential method, as in [2]. Using a similar method, it can be proved that the matching algorithm itself is also linear. Thus the total performance is O(m + n), at the expense of the O(m) space needed to record the prefix function table.

It seems that varying the pattern string would affect the performance of KMP. Consider the case where we are finding the pattern string `aaa...a' of length m in a text string `aaa...a' of length n. All the characters are the same; when the last character in the pattern is examined, we can only fall back by 1, and this 1-character fallback repeats until it falls back to zero. Even in this extreme case, KMP still holds its linear performance (why?). Please try to consider more cases, such as P = aaaa...b, T = aaaa...a, and so on.

Purely functional KMP algorithm

It is not easy to realize the KMP matching algorithm in a purely functional manner. The imperative algorithm presented so far intensively uses arrays to record prefix function values. Although it is possible to utilize sequence-like structures in purely functional settings, such sequences are typically implemented with finger trees. Unlike native arrays, finger trees need logarithmic time for random access.⁶

Richard Bird presents a formal program deduction of the KMP algorithm using the fold fusion law in chapter 17 of [1]. In this section, we show how to develop a purely functional KMP algorithm step by step, starting from a brute-force prefix function creation method.

Both the text string and the pattern are represented as singly linked-lists in the purely functional settings. During the scan process, these two lists are further partitioned: each one is broken into two parts. As shown in figure 14.15, the first j characters in the pattern string have been matched; T[i + 1] and P[j + 1] will be compared next. If they are the same, we need to append the character to the matched part.

⁶Again, we don't use native arrays, even though they are supported in some functional programming environments like Haskell.
However, since strings are essentially singly linked lists, such appending is proportional to j.

Figure 14.15: The first j characters of the pattern have been matched; T[i + 1] and P[j + 1] will be compared next.

Denote the first i characters of the text as T_p (the prefix of T) and the rest of the characters as T_s (the suffix); similarly, denote the first j characters of the pattern as P_p, and the rest as P_s. Denote the first character of T_s as t, and the first character of P_s as p. We have the following `cons' relationships.

T_s = cons(t, T_s')
P_s = cons(p, P_s')

If t = p, note that the following updating process is bound to linear time.

T_p' = T_p ∪ {t}
P_p' = P_p ∪ {p}

We introduced a method in the chapter about the purely functional queue which can solve this problem. By using a pair of front and rear lists, we can turn linear time appending into constant time linking. The key point is to represent the prefix part in reverse order. Writing ←X for reverse(X), we have

T = T_p ∪ T_s = reverse(reverse(T_p)) ∪ T_s = reverse(←T_p) ∪ T_s
P = P_p ∪ P_s = reverse(reverse(P_p)) ∪ P_s = reverse(←P_p) ∪ P_s      (14.23)

The idea is to use the pairs (←T_p, T_s) and (←P_p, P_s) instead. With this change, if t = p, we can update the prefix part in constant time.

←T_p' = cons(t, ←T_p)
←P_p' = cons(p, ←P_p)                                          (14.24)

The KMP matching algorithm starts by initializing the successfully matched prefix parts to empty strings, as follows.

search(P, T) = kmp(π, (∅, P), (∅, T))                          (14.25)

Where π is the prefix function we explained before. The core part of the KMP algorithm, except for the prefix function building, can be defined as below; the pairs are understood in the reversed-prefix representation above.

kmp(π, (P_p, P_s), (T_p, T_s)) =
    {|T_p|}                                       : P_s = ∅ ∧ T_s = ∅
    ∅                                             : P_s ≠ ∅ ∧ T_s = ∅
    {|T_p|} ∪ kmp(π, π(P_p, P_s), (T_p, T_s))     : P_s = ∅ ∧ T_s ≠ ∅
    kmp(π, (P_p', P_s'), (T_p', T_s'))            : t = p
    kmp(π, (P_p, P_s), (T_p', T_s'))              : t ≠ p ∧ P_p = ∅
    kmp(π, π(P_p, P_s), (T_p, T_s))               : t ≠ p ∧ P_p ≠ ∅        (14.26)
The first clause states that, if the scan successfully ends with both the pattern and the text strings exhausted, we get a solution and the algorithm terminates. Note that we use the right position in the text string as the matching point; it's easy to obtain the left position by subtracting the length of the pattern string. For the sake of brevity, we use right positions in the functional solutions.

The second clause states that if the scan arrives at the end of the text string while there are still unmatched characters in the pattern string, there is no further solution, and the algorithm terminates.

The third clause states that, if all the characters in the pattern string have been successfully matched while there are still characters of the text to be examined, we get a solution, and we fall back by calling the prefix function to go on searching for other solutions.

The fourth clause deals with the case where the next characters in the pattern string and the text are the same. In this case, the algorithm advances one character ahead and recursively performs searching. If the next characters are not the same and this is the first character of the pattern string, we just advance to the next character in the text and try again. Otherwise, if this isn't the first character of the pattern, we call the prefix function to fall back, and try again.

The brute-force way to build the prefix function just follows the definition in equation (14.22).

π(P_p, P_s) = (P_p', P_s')                                     (14.27)

Where

P_p' = longest({s | s ∈ prefixes(P_p), s ⊐ P_p})
P_s' = P − P_p'

Every time the fallback position is calculated, the algorithm naively enumerates all prefixes of P_p, checks whether each is also a suffix of P_p, and picks the longest one as the result. Note that we reuse the subtraction symbol here for the list difference operation.

There is a tricky case which should be avoided: because any string is both a prefix and a suffix of itself (P_p ⊏ P_p and P_p ⊐ P_p), we shouldn't enumerate P_p itself as a candidate prefix. One realization of such prefix enumeration is the following.

prefixes(L) =
    {∅}                                              : L = ∅ ∨ |L| = 1
    cons(∅, map(λs · cons(l_1, s), prefixes(L')))    : otherwise           (14.28)
The Haskell example program below implements this version of the string matching algorithm.

import Data.List (isSuffixOf, maximumBy, (\\))
import Data.Function (on)

kmpSearch1 ptn text = kmpSearch' next ([], ptn) ([], text)

kmpSearch' _ (sp, []) (sw, []) = [length sw]
kmpSearch' _ _ (_, []) = []
kmpSearch' f (sp, []) (sw, ws) = length sw : kmpSearch' f (f sp []) (sw, ws)
kmpSearch' f (sp, (p:ps)) (sw, (w:ws))
    | p == w = kmpSearch' f ((p:sp), ps) ((w:sw), ws)
    | otherwise = if sp == [] then kmpSearch' f (sp, (p:ps)) ((w:sw), ws)
                  else kmpSearch' f (f sp (p:ps)) (sw, (w:ws))

next sp ps = (sp', ps')
    where
      prev = reverse sp
      prefix = longest [xs | xs <- inits prev, xs `isSuffixOf` prev]
      sp' = reverse prefix
      ps' = (prev ++ ps) \\ prefix
      longest = maximumBy (compare `on` length)

inits [] = [[]]
inits [_] = [[]]
inits (x:xs) = [] : (map (x:) $ inits xs)

This version not only performs poorly, but is also complex. We can simplify it a bit. Observing that KMP matching is a scan process from left to right over the text, it can be represented with folding (refer to Appendix A for details). Firstly, we can augment each character with an index for folding, as below.

zip(T, {1, 2, ...})                                            (14.29)

Zipping the text string with the infinite natural numbers gives a list of pairs. For example, the text string `The quick brown fox jumps over the lazy dog' turns into (T, 1), (h, 2), (e, 3), ..., (o, 42), (g, 43).

The initial state for folding contains two parts. One is the pair of pattern parts (P_p, P_s), with the prefix starting from empty and the suffix being the whole pattern string: (∅, P). For illustration purposes only, we revert to normal pairs rather than the (←P_p, P_s) notation; it can easily be replaced with the reversed form in the finalized version, and this is left as an exercise to the reader. The other part is a list of positions where successful matches are found. It starts from the empty list; after the folding finishes, this list contains all solutions, and what we need is to extract it from the final state. The core KMP search algorithm is simplified like this.

kmp(P, T) = snd(fold(search, ((∅, P), ∅), zip(T, {1, 2, ...})))       (14.30)

The only `black box' is the search function, which takes a state and a pair of character and index, and returns a new state as the result. Denote the first character in P_s as p and the rest of the characters as P_s' (so P_s = cons(p, P_s')); we have the following definition.
search(((P_p, P_s), L), (c, i)) =
    ((P_p ∪ {p}, P_s'), L ∪ {i})         : p = c ∧ P_s' = ∅
    ((P_p ∪ {p}, P_s'), L)               : p = c ∧ P_s' ≠ ∅
    ((P_p, P_s), L)                      : P_p = ∅
    search((π(P_p, P_s), L), (c, i))     : otherwise           (14.31)

If the first character in P_s matches the current character c of the scan, we further check whether all the characters in the pattern have been examined. If so, we have successfully found a solution, and the position i is recorded in the list L; otherwise, we advance one character ahead and go on. If p does not match c, we need to fall back for a retry. However, there is an edge case where we can't fall back any more: P_p is empty, and in this case we do nothing but keep the current state.

The prefix function developed so far can also be improved a bit. Since we want to find the longest prefix of P_p which is also its suffix, we can scan from right to left instead. For any non-empty list L, denote the first element as l_1, and all the rest except for the first one as L'. Define a function init(L), which returns all the elements except for the last one, as below.

init(L) =
    ∅                      : |L| = 1
    cons(l_1, init(L'))    : otherwise                         (14.32)

Note that this function cannot handle the empty list. The idea of scanning P_p from right to left is to first check whether init(P_p) ⊐ P_p; if yes, then we are done; otherwise, we examine whether init(init(P_p)) is a suffix, and repeat this towards the left end. Based on this idea, the prefix function can be modified as follows.

π(P_p, P_s) =
    (P_p, P_s)                                     : P_p = ∅
    fallback(init(P_p), cons(last(P_p), P_s))      : otherwise        (14.33)

Where

fallback(A, B) =
    (A, B)                                     : A ⊐ P_p
    fallback(init(A), cons(last(A), B))        : otherwise     (14.34)

Note that fallback always terminates, because the empty string is a suffix of any string. The last(L) function returns the last element of a list; it is also a linear time operation (refer to Appendix A for detail), though it becomes a constant time operation in the ←P_p representation. This improved prefix function is bound to linear time, which is still quite a bit slower than the imperative algorithm's constant time O(1) prefix function lookup. The following Haskell example program implements this minor improvement.

failure ([], ys) = ([], ys)
failure (xs, ys) = fallback (init xs) (last xs:ys)
    where
      fallback as bs | as `isSuffixOf` xs = (as, bs)
                     | otherwise = fallback (init as) (last as:bs)

kmpSearch ws txt = snd $ foldl f (([], ws), []) (zip txt [1..])
    where
      f (p@(xs, (y:ys)), ns) (x, n)
          | x == y = if ys == [] then ((xs ++ [y], ys), ns ++ [n])
                     else ((xs ++ [y], ys), ns)
          | xs == [] = (p, ns)
          | otherwise = f (failure p, ns) (x, n)
      f (p, ns) e = f (failure p, ns) e
In fact, the prefix function can be understood as a state transform function: it transfers from one state to another according to whether the match succeeds or fails. We can abstract such state changing as a tree. In an environment supporting algebraic data types, Haskell for example, such a state tree can be defined like below.

data State a = E | S a (State a) (State a)

A state is either empty, or contains three parts: the current state, the new state if the match fails, and the new state if the match succeeds. Such a definition is quite similar to the binary tree; we can call it a `left-fail, right-success' tree. The state we are using here is $(P_p, P_s)$.

Similar to the imperative KMP algorithm, which builds the prefix function from the pattern string, the state transforming tree can also be built from the pattern. The idea is to build the tree from the very beginning state $(\phi, P)$, with both its children empty. We replace the left child with a new state by calling the $failure$ function defined above, and replace the right child by advancing one character ahead. There is an edge case: when the state transfers to $(P, \phi)$, we cannot advance any more in the success case, so such a node only contains a child for the failure case. The build function is defined as the following.

\[
build((P_p, P_s), \phi, \phi) =
\begin{cases}
((P_p, P_s), L, \phi) & P_s = \phi \\
((P_p, P_s), L, R) & \text{otherwise}
\end{cases}
\tag{14.35}
\]

Where

\[ L = build(failure(P_p, P_s), \phi, \phi), \quad R = build((P_p \cup \{p\}, P_s'), \phi, \phi) \]

The meaning of $p$ and $P_s'$ is the same as before: $p$ is the first character in $P_s$, and $P_s'$ is the rest of the characters.
The most interesting point is that the build function never stops: it endlessly builds an infinite tree. In a strict programming environment, calling this function will freeze. However, in environments that support lazy evaluation, only the nodes that have to be used will be created. For example, both Haskell and Scheme/Lisp are capable of constructing such an infinite state tree. In imperative settings, this is typically realized by using pointers which link to the ancestors of a node. Figure 14.16 illustrates such an infinite state tree for the pattern string `ananym'. Note that the rightmost edge represents the case in which the matching continuously succeeds for all characters; after that, since we can't match any more, the right sub-tree is empty. Based on this fact, we can define an auxiliary function to test whether a state indicates that the whole pattern has been successfully matched.

\[
match((P_p, P_s), L, R) =
\begin{cases}
True & P_s = \phi \\
False & \text{otherwise}
\end{cases}
\tag{14.36}
\]
Figure 14.16: The infinite state tree for pattern `ananym'.
With the help of the state transform tree, we can realize the KMP algorithm in an automaton manner.

\[ kmp(P, T) = snd(fold(search, (T_r, \phi), zip(T, \{1, 2, ...\}))) \tag{14.37} \]

Where the tree $T_r = build((\phi, P), \phi, \phi)$ is the infinite state transform tree. Function $search$ utilizes this tree to transform the state according to match or fail. Denote the first character in $P_s$ as $p$, the rest of the characters as $P_s'$, and the matched positions found so far as $A$.

\[
search((((P_p, P_s), L, R), A), (c, i)) =
\begin{cases}
(R, A \cup \{i\}) & p = c \land match(R) \\
(R, A) & p = c \land \lnot match(R) \\
(((P_p, P_s), L, R), A) & P_p = \phi \\
search((L, A), (c, i)) & \text{otherwise}
\end{cases}
\tag{14.38}
\]

The following Haskell example program implements this algorithm.

data State a = E | S a (State a) (State a) -- state, fail-state, ok-state
              deriving (Eq, Show)

build :: (Eq a) => State ([a], [a]) -> State ([a], [a])
build (S s@(xs, []) E E) = S s (build (S (failure s) E E)) E
build (S s@(xs, (y:ys)) E E) = S s l r where
    l = build (S (failure s) E E) -- fail state
    r = build (S (xs++[y], ys) E E)

matched (S (_, []) _ _) = True
matched _ = False

kmpSearch3 :: (Eq a) => [a] -> [a] -> [Int]
kmpSearch3 ws txt = snd $ foldl f (auto, []) (zip txt [1..]) where
    auto = build (S ([], ws) E E)
    f (s@(S (xs, ys) l r), ns) (x, n)
        | [x] `isPrefixOf` ys = if matched r then (r, ns++[n]) else (r, ns)
        | xs == [] = (s, ns)
        | otherwise = f (l, ns) (x, n)
The bottleneck is that the state tree building function calls $fallback$, and the current definition isn't effective enough, because it enumerates all candidates from right to left every time. Since the state tree is infinite, we can adopt some common treatment for infinite structures. One good example is the Fibonacci series. The first two Fibonacci numbers are defined as 0 and 1; the rest of the Fibonacci numbers are obtained by adding the previous two.

\[
\begin{array}{l}
F_0 = 0 \\
F_1 = 1 \\
F_n = F_{n-1} + F_{n-2}
\end{array}
\tag{14.39}
\]

Thus the Fibonacci numbers can be listed one by one as the following:

\[
\begin{array}{l}
F_0 = 0 \\
F_1 = 1 \\
F_2 = F_1 + F_0 \\
F_3 = F_2 + F_1 \\
...
\end{array}
\tag{14.40}
\]

We can collect the numbers on both sides and define $F = \{0, 1, F_1, F_2, ...\}$. Thus we have the following equation.

\[
\begin{array}{rl}
F & = \{0, 1, F_1 + F_0, F_2 + F_1, ...\} \\
  & = \{0, 1\} \cup \{x + y | x \in \{F_0, F_1, F_2, ...\}, y \in \{F_1, F_2, F_3, ...\}\} \\
  & = \{0, 1\} \cup \{x + y | x \in F, y \in F'\}
\end{array}
\tag{14.41}
\]

Where $F' = tail(F)$ is all the Fibonacci numbers except for the first one. In environments that support lazy evaluation, Haskell for instance, this definition can be expressed like below.

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
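Python has no pervasive lazy evaluation, but a generator gives the same produce-on-demand flavor. This small sketch (ours, for illustration) lists the conceptually infinite Fibonacci series one element at a time:

from itertools import islice

def fibs():
    # an 'infinite' Fibonacci series, produced on demand
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

print(list(islice(fibs(), 10)))   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]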
This recursive definition of the infinite Fibonacci series indicates an idea which can be used to get rid of the $fallback$ function. Denote the state transfer tree as $T$; we can define the transfer function, for matching a character $c$ on this tree, as the following.

\[
trans(T, c) =
\begin{cases}
root & T = \phi \\
R & T = ((P_p, P_s), L, R), c = p \\
trans(L, c) & \text{otherwise}
\end{cases}
\tag{14.42}
\]

If we match a character against an empty node, we transfer to the root of the tree; we'll define the root soon. Otherwise, we compare whether the character $c$ is the same as the first character $p$ in $P_s$. If they match, we transfer to the right sub-tree for the success case; otherwise, we transfer to the left sub-tree for the fail case. With the transfer function defined, we can modify the previous tree building function accordingly. This is quite similar to the previous Fibonacci series definition.

\[ build(T, (P_p, P_s)) = ((P_p, P_s), T, build(trans(T, p), (P_p \cup \{p\}, P_s'))) \]

The right hand side of this equation contains three parts. The first one is the state that we are matching, $(P_p, P_s)$. If the match fails, since $T$ itself can handle any fail case, we use it directly as the left sub-tree. Otherwise, we recursively build the right sub-tree for the success case by advancing one character ahead and calling the transfer function defined above. However, there is an edge case which has to be handled specially: if $P_s$ is empty, which indicates a successful match, there isn't a right sub-tree any more, as defined above. Combining these cases gives the final building function.

\[
build(T, (P_p, P_s)) =
\begin{cases}
((P_p, P_s), T, \phi) & P_s = \phi \\
((P_p, P_s), T, build(trans(T, p), (P_p \cup \{p\}, P_s'))) & \text{otherwise}
\end{cases}
\tag{14.43}
\]
The last brick is to define the root of the infinite state transfer tree, which initializes the building.

\[ root = build(\phi, (\phi, P)) \tag{14.44} \]

And the new KMP matching algorithm is modified with this root.

\[ kmp(P, T) = snd(fold(trans, (root, []), zip(T, \{1, 2, ...\}))) \tag{14.45} \]

The following Haskell example program implements this final version.
kmpSearch ws txt = snd $ foldl tr (root, []) (zip txt [1..]) where
    root = build' E ([], ws)
    build' fails (xs, []) = S (xs, []) fails E
    build' fails s@(xs, (y:ys)) = S s fails succs where
        succs = build' (fst (tr (fails, []) (y, 0))) (xs++[y], ys)
    tr (E, ns) _ = (root, ns)
    tr ((S (xs, ys) fails succs), ns) (x, n)
        | [x] `isPrefixOf` ys = if matched succs then (succs, ns++[n])
                                                 else (succs, ns)
        | otherwise = tr (fails, ns) (x, n)

Figure 14.17 shows the first 4 steps when searching `ananym' in the text `anal'. Since the first 3 steps all succeed, the left sub-trees of these 3 states are not actually constructed; they are marked as `?'. In the fourth step, the match fails, thus the right sub-tree needn't be built. On the other hand, we must construct the left sub-tree, which is built on top of the result of $trans(right(right(right(T))), n)$, where function $right(T)$ returns the right sub-tree of $T$. This can be further expanded according to the definitions of the building and state transforming functions, till we get the concrete state $((a, nanym), L, R)$. The detailed deduction process is left as an exercise to the reader.

Figure 14.17: On-demand construction of the state transform tree when searching `ananym' in text `anal'.

This algorithm depends on lazy evaluation critically. All the states to be transferred are built on demand, so that the building process is amortized
O(m), and the total performance is amortized $O(n + m)$. Readers can refer to [1] for a detailed proof. It's worth comparing the final purely functional and the imperative algorithms. In many cases we have an expressive functional realization; however, for the KMP matching algorithm, the imperative approach is much simpler and more intuitive, because otherwise we have to mimic the raw array with an infinite state transfer tree.
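For contrast, the classic imperative KMP stores the failure function in a plain array, built in $O(m)$ and looked up in $O(1)$. The following Python rendering is the standard textbook version, not code from this book:

def prefix_table(p):
    m = len(p)
    fail = [0] * m          # fail[i]: length of the longest proper prefix
    k = 0                   # of p[:i+1] that is also its suffix
    for i in range(1, m):
        while k > 0 and p[i] != p[k]:
            k = fail[k - 1]
        if p[i] == p[k]:
            k = k + 1
        fail[i] = k
    return fail

def kmp_match(p, t):
    if not p:
        return []
    fail, k, res = prefix_table(p), 0, []
    for i, c in enumerate(t):
        while k > 0 and c != p[k]:
            k = fail[k - 1]
        if c == p[k]:
            k = k + 1
        if k == len(p):
            res.append(i + 1)    # 1-based end position, as in the text
            k = fail[k - 1]
    return res

print(kmp_match("ab", "abab"))   # [2, 4]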
Boyer-Moore

The Boyer-Moore string matching algorithm is another effective solution, invented in 1977 [10]. The idea of the Boyer-Moore algorithm comes from the following observation.

The bad character heuristics

When we attempt to match the pattern, even if several characters from the left are the same, the attempt fails if the last one does not match, as shown in figure 14.18. What's more, we wouldn't find a match even if we slid the pattern down by 1, or 2. Actually, the length of the pattern `ananym' is 6, and the last character is `m'; however, the corresponding character in the text is `h', which does not appear in the pattern at all. We can directly slide the pattern down by 6.

Figure 14.18: Since character `h' doesn't appear in the pattern, we wouldn't find a match even if we slid the pattern down by less than the length of the pattern.

This leads to the bad-character rule. We can do some pre-processing of the pattern: if the character set of the text is already known, we can find all the characters which don't appear in the pattern string. During the later scan process, as soon as we find such a bad character, we can immediately slide the pattern down by its length. The question is what to do if the unmatched character does appear in the pattern. In that case, in order not to miss any potential matches, we have to slide the pattern down to align that character and check again. This is shown in figure 14.19.

It's quite possible that the unmatched character appears in the pattern at more than one position. Denote the length of the pattern as $|P|$, and suppose the character appears at positions $p_1, p_2, ..., p_i$. In such a case, we take the rightmost one to avoid missing any matches.

\[ s = |P| - p_i \tag{14.46} \]

Note that the shifting length is 0 for the last position in the pattern according to the above equation; thus we can skip it in the realization.
Figure 14.19: Slide the pattern if the unmatched character appears in the pattern. (a) The last character in the pattern, `e', doesn't match `p'; however, `p' appears in the pattern. (b) We have to slide the pattern down by 2 to check again.

Another important point is that the shifting length is calculated against the position aligned with the last character in the pattern string (we deduce it from $|P|$). So no matter where the mismatch happens when we scan from right to left, we slide the pattern down by looking up the bad-character table with the character in the text aligned with the last character of the pattern. This is shown in figure 14.20.

Figure 14.20: Even if the mismatch happens in the middle, between chars `i' and `a', we look up the shifting value with character `e', which is 6 (calculated from the first `e'; the second `e' is skipped to avoid a zero shift).

There is a good result in practice: using only the bad-character rule leads to a simple and fast string matching algorithm, called the Boyer-Moore-Horspool algorithm [11].

1: procedure Boyer-Moore-Horspool(T, P)
2:     for ∀c ∈ Σ do
3:         π[c] ← |P|
4:     for i ← 1 to |P| - 1 do    ▷ Skip the last position
5:         π[P[i]] ← |P| - i
6:     s ← 0
7:     while s + |P| ≤ |T| do
8:         i ← |P|
9:         while i ≥ 1 ∧ P[i] = T[s + i] do    ▷ scan from right
10:            i ← i - 1
11:        if i < 1 then
12:            found one solution at s
13:            s ← s + 1    ▷ go on finding the next
14:        else
15:            s ← s + π[T[s + |P|]]
The character set is denoted as Σ. We first initialize all the values of the sliding table π to the length of the pattern string $|P|$. After that, we process the pattern from left to right, updating the sliding values; if a character appears multiple times in the pattern, the latter value, the one on the right hand side, overwrites the previous one. We start the matching scan by aligning the pattern and the text string at the very left. However, for every alignment $s$, we scan from right to left until either there is an unmatched character or all the characters in the pattern have been examined. The latter case indicates that we've found a match, while in the former case we look up the table to slide the pattern down to the right. The following example Python code implements this algorithm accordingly.

def bmh_match(w, p):
    n = len(w)
    m = len(p)
    tab = [m for _ in range(256)] # table to hold the bad character rule.
    for i in range(m-1):
        tab[ord(p[i])] = m - 1 - i
    res = []
    offset = 0
    while offset + m <= n:
        i = m - 1
        while i >= 0 and p[i] == w[offset+i]:
            i = i - 1
        if i < 0:
            res.append(offset)
            offset = offset + 1
        else:
            offset = offset + tab[ord(w[offset + m - 1])]
    return res
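A quick check of the routine above (offsets are 0-based):

print(bmh_match("here is a simple example", "example"))   # [17]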
The algorithm first takes about $O(|\Sigma| + |P|)$ time to build the sliding table. If the character set is small, the performance is dominated by scanning the pattern against the text. There is definitely a worst case in which all the characters in the pattern and the text are the same, e.g. searching `aa...a' ($m$ `a's, denoted as $a^m$) in the text `aa...a' ($n$ `a's, denoted as $a^n$): the performance in the worst case is $O(mn)$. This algorithm performs well if the pattern is long and there are only a constant number of matches; the result is then bound to linear time. This is the same as the best case of the full Boyer-Moore algorithm, which is explained next.
The good suffix heuristics

Consider searching the pattern `abbabab' in the text `bbbababbabab...' as in figure 14.21. By using the bad-character rule, the pattern will be slid by two.

Figure 14.21: According to the bad-character rule, the pattern is slid by 2, so that the next `b' is aligned.

Actually, we can do better than this. Observe that before the unmatched point, we have already successfully matched 6 characters, `bbabab', from right to left. Since `ab', the prefix of the pattern, is also a suffix of what we have matched so far, we can directly slide the pattern down to align this suffix, as shown in figure 14.22.

Figure 14.22: As the prefix `ab' is also a suffix of what we've matched, we can slide the pattern down to a position where `ab' is aligned.

This is quite similar to the pre-processing of the KMP algorithm. However, we can't always skip so many characters. Consider the example shown in figure 14.23. We have matched the characters `bab' when the mismatch happens. Although the prefix `ab' of the pattern is also a suffix of `bab', we can't slide the pattern so far, because `bab' appears somewhere else in the pattern, starting from its 3rd character. In order not to miss any potential matches, we can only slide the pattern by two.

Figure 14.23: We've matched `bab', which appears somewhere else in the pattern (from the 3rd to the 5th character). We can only slide the pattern down by 2 to avoid missing any potential matches.
The above situations form the two cases of the good-suffix rule, as shown in figure 14.24.

Figure 14.24: The light gray section in the text represents the characters that have been matched; the dark gray parts indicate the same content in the pattern. (a) Case 1: only a part of the matching suffix occurs as a prefix of the pattern. (b) Case 2: the matching suffix occurs somewhere else in the pattern.

Both cases of the good-suffix rule handle the situation in which multiple characters have been matched from the right. We can slide the pattern to the right if either of the following happens.

Case 1 states that if a part of the matching suffix occurs as a prefix of the pattern, and the matching suffix doesn't appear anywhere else in the pattern, we can slide the pattern to the right to make this prefix aligned.

Case 2 states that if the matching suffix occurs somewhere else in the pattern, we can slide the pattern to make the rightmost occurrence aligned.

Note that in the scan process, we should apply case 2 first whenever it is possible, and only examine case 1 if the whole matched suffix does not appear anywhere else in the pattern. Observe that both cases of the good-suffix rule depend only on the pattern string, so a table can be built by pre-processing the pattern for later look-ups.

For the sake of brevity, we denote the suffix string starting from the $i$-th character of $P$ as $P_i$; that is, $P_i$ is the sub-string $P[i]P[i+1]...P[m]$. For case 1, we can check every suffix of $P$, which includes $P_m, P_{m-1}, P_{m-2}, ..., P_2$, to examine whether it is a prefix of $P$. This can be achieved by a round of scanning from right to left. For case 2, we can check every prefix of $P$, including $P_1, P_2, ..., P_{m-1}$, to examine whether its longest suffix is also a suffix of $P$. This can be achieved by another round of scanning from left to right.

1: function Good-Suffix(P)
2:     m ← |P|
3:     πs ← {0, 0, ..., 0}    ▷ Initialize the table of length m
4:     l ← 0    ▷ The last suffix which is also a prefix of P
5:     for i ← m - 1 down-to 1 do    ▷ First loop for case 1
6:         if Pi ⊏ P then    ▷ ⊏ means `is prefix of'
7:             l ← i
8:         πs[i] ← l
9:     for i ← 1 to m do    ▷ Second loop for case 2
10:        s ← Suffix-Length(Pi)
11:        if s ≠ 0 ∧ P[i - s] ≠ P[m - s] then
12:            πs[m - s] ← m - i
13:    return πs

This algorithm builds the good-suffix heuristics table πs. It first checks every suffix of $P$, from the shortest to the longest. If the suffix $P_i$ is also a prefix of $P$, we record this suffix, and use it for all the entries until we find another suffix $P_j$, $j < i$, which is also a prefix of $P$. After that, the algorithm checks every prefix of $P$, from the shortest to the longest. It calls the function Suffix-Length($P_i$) to calculate the length of the longest suffix of $P_i$ which is also a suffix of $P$. If this length $s$ isn't zero, there exists a sub-string of length $s$ which appears as the suffix of the pattern, indicating that case 2 happens; the algorithm overwrites the $s$-th entry from the right of the table πs. Note that to avoid finding the same occurrence of the matched suffix, we test whether $P[i - s]$ and $P[m - s]$ differ.
Function Suffix-Length is designed as the following.

1: function Suffix-Length(Pi)
2:     m ← |P|
3:     j ← 0
4:     while P[m - j] = P[i - j] ∧ j < i do
5:         j ← j + 1
6:     return j

The following Python example program implements the good-suffix rule.

def good_suffix(p):
    m = len(p)
    tab = [0 for _ in range(m)]
    last = 0
    # first loop for case 1
    for i in range(m-1, 0, -1): # m-1, m-2, ..., 1
        if is_prefix(p, i):
            last = i
        tab[i - 1] = last
    # second loop for case 2
    for i in range(m):
        slen = suffix_len(p, i)
        if slen != 0 and p[i - slen] != p[m - 1 - slen]:
            tab[m - 1 - slen] = m - 1 - i
    return tab

# test if p[i..m-1] `is prefix of` p
def is_prefix(p, i):
    for j in range(len(p) - i):
        if p[j] != p[i+j]:
            return False
    return True

# length of the longest suffix of p[..i], which is also a suffix of p
def suffix_len(p, i):
    m = len(p)
    j = 0
    while p[m - 1 - j] == p[i - j] and j < i:
        j = j + 1
    return j
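As a sanity check, running it on the pattern `abbabab' used in the good-suffix discussion above gives the table below (0-based entries; traced by hand from this program, so treat the values as illustrative):

print(good_suffix("abbabab"))   # [5, 5, 5, 2, 5, 4, 0]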
It's quite possible that both the bad-character rule and the good-suffix rule apply when a mismatch happens. The Boyer-Moore algorithm compares the two shifts and picks the bigger one, so that it can find the solution as quickly as possible. The bad-character rule table can be built explicitly as below.

1: function Bad-Character(P)
2:     for ∀c ∈ Σ do
3:         πb[c] ← |P|
4:     for i ← 1 to |P| - 1 do
5:         πb[P[i]] ← |P| - i
6:     return πb

The following Python program implements the bad-character rule accordingly.

def bad_char(p):
    m = len(p)
    tab = [m for _ in range(256)]
    for i in range(m-1):
        tab[ord(p[i])] = m - 1 - i
    return tab

The main algorithm first builds the two rule tables from the pattern, then aligns the pattern with the beginning of the text and scans from right to left for every alignment. If any mismatch happens, it tries both rules and slides the pattern down by the bigger shift.

1: function Boyer-Moore(T, P)
2:     n ← |T|, m ← |P|
3:     πb ← Bad-Character(P)
4:     πs ← Good-Suffix(P)
5:     s ← 0
6:     while s + m ≤ n do
7:         i ← m
8:         while i ≥ 1 ∧ P[i] = T[s + i] do
9:             i ← i - 1
10:        if i < 1 then
11:            found one solution at s
12:            s ← s + 1    ▷ go on finding the next
13:        else
14:            s ← s + max(πb[T[s + m]], πs[i])

Here is an example implementation of the Boyer-Moore algorithm in Python.

def bm_match(w, p):
    n = len(w)
    m = len(p)
    tab1 = bad_char(p)
    tab2 = good_suffix(p)
    res = []
    offset = 0
    while offset + m <= n:
        i = m - 1
        while i >= 0 and p[i] == w[offset + i]:
            i = i - 1
        if i < 0:
            res.append(offset)
            offset = offset + 1
        else:
            offset = offset + max(tab1[ord(w[offset + m - 1])], tab2[i])
    return res
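A small cross-check of bm_match against a naive scan, assuming the two table builders above are in scope:

text, pat = "bbbababbabab", "abbabab"
naive = [i for i in range(len(text) - len(pat) + 1)
         if text[i:i+len(pat)] == pat]
print(bm_match(text, pat) == naive)   # True: the single match at offset 5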
The Boyer-Moore algorithm as published in the original paper is bound to $O(n + m)$ in the worst case only if the pattern doesn't appear in the text [10]; Knuth, Morris, and Pratt proved this fact in 1977 [12]. However, when the pattern appears in the text, as shown above, Boyer-Moore performs at $O(nm)$ in the worst case. Richard Bird shows a purely functional realization of the Boyer-Moore algorithm in chapter 16 of [1]; we skip it in this book.

Exercise 14.2

Prove that the Boyer-Moore majority vote algorithm is correct.

Given a list, find the element that occurs most often. Are there any divide and conquer solutions? Are there any divide and conquer data structures, such as a map, that can be used?

Bentley presents a divide and conquer algorithm to find the maximum sum in $O(n \lg n)$ time in [4]. The idea is to split the list at the middle point: we can recursively find the maximum sum in the first half and in the second half; however, we also need to find the maximum sum crossing the middle point. The method is to scan from the middle point towards both ends as the following.

1: function Max-Sum(A)
2:     if A = φ then
3:         return 0
4:     else if |A| = 1 then
5:         return Max(0, A[1])
6:     else
7:         m ← ⌊|A| / 2⌋
8:         a ← Max-From(Reverse(A[1...m]))
9:         b ← Max-From(A[m + 1...|A|])
10:        c ← Max-Sum(A[1...m])
11:        d ← Max-Sum(A[m + 1...|A|])
12:        return Max(a + b, c, d)

13: function Max-From(A)
14:     sum ← 0, m ← 0
15:     for i ← 1 to |A| do
16:         sum ← sum + A[i]
17:         m ← Max(m, sum)
18:     return m

It's easy to deduce that the time performance is $T(n) = 2T(n/2) + O(n)$. Implement this algorithm in your favorite programming language; one possible sketch follows this item.
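A Python sketch of Bentley's method, with the empty sum taken as 0 (the names max_sum and max_from are ours):

def max_sum(a):
    # divide and conquer, O(n lg n): the answer lies within one half,
    # or is a sum crossing the middle point
    if not a:
        return 0
    if len(a) == 1:
        return max(0, a[0])
    m = len(a) // 2
    cross = max_from(reversed(a[:m])) + max_from(a[m:])
    return max(cross, max_sum(a[:m]), max_sum(a[m:]))

def max_from(a):
    # maximum sum over the prefixes of a (0 for the empty prefix)
    s, best = 0, 0
    for x in a:
        s = s + x
        best = max(best, s)
    return best

print(max_sum([3, -13, 19, -12, 1, 9, 18, -16, 15, -15]))   # 35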
Explain why the KMP algorithm performs in linear time even in the seemingly `worst' case.

Implement the purely functional KMP algorithm by using the reversed prefix $\overleftarrow{P_p}$ to avoid the linear-time appending operation.

Deduce the state of the tree $left(right(right(right(T))))$ when searching `ananym' in the text `anal'.

14.3 Solution searching

One interesting thing that computer programming can offer is solving puzzles. In the early phase of classic artificial intelligence, people developed many methods to search for solutions. Different from sequence searching and string matching, the solution doesn't obviously exist among a set of candidates; it typically needs to be constructed while trying various attempts. Some problems are solvable, while others are not. Among the solvable problems, not all of them have just one unique solution. For example, a maze may have multiple ways out; people sometimes need to search for the best one.

14.3.1 DFS and BFS

DFS and BFS stand for depth-first search and breadth-first search. They are typically introduced as graph algorithms in textbooks. Graph is a comprehensive topic which is hard to cover in this elementary book; in this section, we'll show how to use DFS and BFS to solve some real puzzles without a formal introduction of the graph concept.

Maze

Maze is a classic and popular puzzle, amazing to both kids and adults. Figure 14.25 shows an example maze. There are also real maze gardens in parks for fun. In the late 1990s, maze-solving games were quite often held in robot mouse competitions all over the world.

Figure 14.25: A maze
There are multiple methods to solve the maze puzzle. We'll introduce an effective, yet not the best one, in this section. There are some well known sayings about how to find the way out of a maze, but not all of them are true. For example, one method states that, wherever you have multiple ways, always turn right. This doesn't work, as shown in figure 14.26. The obvious solution is first to go along the top horizontal line, then turn right, and keep going ahead at the `T' section. However, if we always turn right, we'll loop endlessly around the inner big block.

Figure 14.26: Always turning right leads to an endless loop.

This example tells us that the decision made where there are multiple choices matters to the solution. Like in the fairy tale we read in our childhood, we can drop some bread crumbs in a maze. When there are multiple ways, we simply select one, leaving a piece of bread crumb to mark this attempt. If we enter a dead end, we go back to the last place where we made a decision, by back-tracking the bread crumbs, and then try a different way. At any time, if we find bread crumbs already left somewhere, it means we have entered a loop, and we must go back and try a different way. Repeating these try-and-check steps, we can either find the way out, or establish the `no solution' fact; in the latter case, we back-track to the start point.

One easy way to describe a maze is by an $m \times n$ matrix: each element is either 0 or 1, indicating whether there is a way at this cell. The maze illustrated in figure 14.26 can be defined as the following matrix.
\[
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}
\]

Given a start point $s = (i, j)$ and a goal $e = (p, q)$, we need to find all solutions, that is, all the paths from $s$ to $e$. There is an obvious recursive exhaustive search method: in order to find all paths from $s$ to $e$, we check all the points connected to $s$; for every such point $k$, we recursively find all paths from $k$ to $e$. The method can be illustrated as follows.

- Trivial case: if the start point $s$ is the same as the target point $e$, we are done;
- Otherwise, for every point $k$ connected to $s$, recursively find the paths from $k$ to $e$; if $e$ can be reached via $k$, put the section $s$-$k$ in front of each path between $k$ and $e$.

However, we have to leave `bread crumbs' to avoid repeating the same attempts. Otherwise, in the recursive case, we would start from $s$, find a connected point $k$, and then try to find paths from $k$ to $e$; since $s$ is connected to $k$ as well, the next recursion would try to find paths from $s$ to $e$ again. That turns back into the very same original problem, and we would be trapped in infinite recursion.

Our solution is to initialize an empty list, and use it to record all the points we've visited so far. For every connected point, we look up the list to examine whether it has already been visited. We skip all the visited candidates and only try the new ones. The corresponding algorithm can be defined like this:

\[ solveMaze(m, s, e) = solve(s, \{\phi\}) \tag{14.47} \]

Where $m$ is the matrix which defines the maze, $s$ is the start point, and $e$ is the end point. Function $solve$ is defined in the context of $solveMaze$, so that the maze and the end point can be accessed. It can be realized recursively as described above.7

\[
solve(s, P) =
\begin{cases}
\{\{s\} \cup p | p \in P\} & s = e \\
concat(\{solve(s', \{\{s\} \cup p | p \in P\}) | s' \in adj(s), \lnot visited(s', P)\}) & \text{otherwise}
\end{cases}
\tag{14.48}
\]

Note that $P$ also serves as an accumulator. Every connected point is recorded in all the possible paths to the current position, but the points are stored in reversed order: the newly visited point is put at the head of each list, and the starting point is the last one. This is because appending is linear ($O(n)$, where $n$ is the number of elements stored in a list), while linking to the head takes only constant time. We can output the result in the correct order by reversing all possible solutions in equation (14.47)8:

\[ solveMaze(m, s, e) = map(reverse, solve(s, \{\phi\})) \tag{14.49} \]

We need to define the functions $adj(p)$ and $visited(p, P)$, which find all the points connected to $p$, and test whether point $p$ has been visited, respectively. Two points are connected if and only if they are neighbor cells horizontally or vertically in the maze matrix, and both have zero value.

\[
adj((x, y)) = \{(x', y') | (x', y') \in \{(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)\}, 1 \le x' \le M, 1 \le y' \le N, m_{x'y'} = 0\}
\tag{14.50}
\]

Where $M$ and $N$ are the width and the height of the maze. Function $visited(p, P)$ examines whether point $p$ has been recorded in any list in $P$.

\[ visited(p, P) = \exists path \in P, p \in path \tag{14.51} \]

7 Function $concat$ can flatten a list of lists. For example, $concat(\{\{a, b, c\}, \{x, y, z\}\}) = \{a, b, c, x, y, z\}$. Refer to appendix A for detail.
8 The detailed definition of $reverse$ can be found in appendix A.
The following Haskell example code implements this algorithm.

solveMaze m from to = map reverse $ solve from [[]] where
    solve p paths
        | p == to = map (p:) paths
        | otherwise = concat [solve p' (map (p:) paths) |
                              p' <- adjacent p,
                              not $ visited p' paths]
    adjacent (x, y) = [(x', y') |
                       (x', y') <- [(x-1, y), (x+1, y), (x, y-1), (x, y+1)],
                       inRange (bounds m) (x', y'), m ! (x', y') == 0]
    visited p paths = any (p `elem`) paths

For a maze defined as a matrix like the below example, all the solutions can be given by this program.

mz = [[0, 0, 1, 0, 1, 1],
      [1, 0, 1, 0, 1, 1],
      [1, 0, 0, 0, 0, 0],
      [1, 1, 0, 1, 1, 1],
      [0, 0, 0, 0, 0, 0],
      [0, 0, 0, 1, 1, 0]]

maze = listArray ((1,1), (6, 6)) . concat

solveMaze (maze mz) (1,1) (6, 6)
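The same exhaustive search can be transcribed into Python. This is a sketch under our own conventions: solve_maze and its inner helpers are hypothetical names, positions are (row, column) pairs into the 0/1 matrix, and the mz matrix above (which happens to be valid Python as well) can be reused with 0-based corners.

def solve_maze(m, src, dst):
    def adjacent(p):
        (y, x) = p
        cands = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
        return [(r, c) for (r, c) in cands
                if 0 <= r < len(m) and 0 <= c < len(m[0]) and m[r][c] == 0]

    def solve(p, paths):
        paths = [[p] + path for path in paths]   # record the new point
        if p == dst:
            return paths
        return [found                            # try every unvisited neighbor
                for p2 in adjacent(p)
                if not any(p2 in path for path in paths)
                for found in solve(p2, paths)]

    return [list(reversed(path)) for path in solve(src, [[]])]

print(solve_maze(mz, (0, 0), (5, 5)))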
As we mentioned, this is a style of `exhaustive search': it recursively searches all the connected points as candidates. In a real maze-solving game, a robot mouse competition for instance, it's enough to just find one route. We can adapt it to a method close to what was described at the beginning of this section: the robot mouse always tries the first connected point, and skips the others until it gets stuck. We need some data structure to store the `bread crumbs', which help to remember the decisions made. As we always attempt to find a way on top of the latest decision, it is last-in, first-out, so a stack can be used to realize it.

At the very beginning, only the starting point $s$ is stored in the stack. We pop it out and find, for example, that points $a$ and $b$ are connected to $s$. We push the two possible paths $\{a, s\}$ and $\{b, s\}$ to the stack. Next we pop $\{a, s\}$ out and examine the points connected to $a$; then all the paths with 3 steps are pushed back. We repeat this process. At any time, each element stored in the stack is a path, from the starting point to the farthest place reached, in reversed order. This is illustrated in figure 14.27.

Figure 14.27: The stack is initialized with a singleton list of the starting point $s$. $s$ is connected with points $a$ and $b$; paths $\{a, s\}$ and $\{b, s\}$ are pushed back. At some step, the path ending with point $p$ is popped. $p$ is connected with points $i$, $j$, and $k$; these 3 points are expanded as different options and pushed back to the stack. The candidate path ending with $q$ won't be examined unless all the options above it fail.

The stack can be realized with a list. The latest option is picked from the head, and the new candidates are also added to the head. The maze puzzle can be solved by using such a list of paths:

\[ solveMaze'(m, s, e) = reverse(solve'(\{\{s\}\})) \tag{14.52} \]

As we are searching for the first, but not all, solutions, $map$ isn't used here. When the stack is empty, it means that we've tried all the options and failed to find a way out: there is no solution. Otherwise, the top option is popped, expanded with all the adjacent points which haven't been visited before, and pushed back to the stack. Denote the stack as $S$; if it isn't empty, the top element is $s_1$, and the stack after popping the top becomes $S'$. $s_1$ is a list of points representing a path; denote the first point in this path as $p_1$, and the rest as $P'$. The solution can be formalized as the following.

\[
solve'(S) =
\begin{cases}
\phi & S = \phi \\
s_1 & p_1 = e \\
solve'(S') & C = \phi \\
solve'(\{\{c\} \cup s_1 | c \in C\} \cup S') & \text{otherwise}
\end{cases}
\tag{14.53}
\]

Where $C = \{c | c \in adj(p_1), c \notin P'\}$ are the unvisited adjacent points, and the $adj$ function is defined above.
This updated maze solution can be implemented with the below example Haskell program.9

dfsSolve m from to = reverse $ solve [[from]] where
    solve [] = []
    solve (c@(p:path):cs)
        | p == to = c -- stop at the first solution
        | otherwise = let os = filter (`notElem` path) (adjacent p) in
                      if os == [] then solve cs
                                  else solve ((map (:c) os) ++ cs)

It's quite easy to modify this algorithm to find all solutions: when we find a path in the second clause, instead of returning it immediately, we record it and go on checking the rest of the memorized options in the stack, until the stack becomes empty. We leave this as an exercise to the reader.

The same idea can also be realized imperatively. We maintain a stack to store all possible paths from the starting point. In each iteration, the top option path is popped; if the farthest position is the end point, a solution is found, otherwise all the adjacent, not yet visited points are appended as new paths and pushed back to the stack. This is repeated till all the candidate paths in the stack have been checked. We use the same notation to represent the stack $S$, but the paths are stored as arrays instead of lists in the imperative settings, as this is more effective. Because of this, the starting point is the first element in the path array, while the farthest reached place is the rightmost element. We use $p_n$ to represent $Last(P)$ for path $P$. The imperative algorithm can be given as below.

9 The same code for the adjacent function is skipped.
1: function Solve-Maze(m, s, e)
2:     S ← φ
3:     Push(S, {s})
4:     L ← φ    ▷ the result list
5:     while S ≠ φ do
6:         P ← Pop(S)
7:         if e = pn then
8:             Add(L, P)
9:         else
10:            for ∀p ∈ Adjacent(m, pn) do
11:                if p ∉ P then
12:                    Push(S, P ∪ {p})
13:    return L

The following example Python program implements this maze solving algorithm.

def solve(m, src, dst):
    stack = [[src]]
    s = []
    while stack != []:
        path = stack.pop()
        if path[-1] == dst:
            s.append(path)
        else:
            for p in adjacent(m, path[-1]):
                if not p in path:
                    stack.append(path + [p])
    return s

def adjacent(m, p):
    (x, y) = p
    ds = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    ps = []
    for (dx, dy) in ds:
        x1 = x + dx
        y1 = y + dy
        if 0 <= x1 and x1 < len(m[0]) and 0 <= y1 and y1 < len(m) and m[y1][x1] == 0:
            ps.append((x1, y1))
    return ps

And the same maze example given above can be solved by this program like the following.

mz = [[0, 0, 1, 0, 1, 1],
      [1, 0, 1, 0, 1, 1],
      [1, 0, 0, 0, 0, 0],
      [1, 1, 0, 1, 1, 1],
      [0, 0, 0, 0, 0, 0],
      [0, 0, 0, 1, 1, 0]]

solve(mz, (0, 0), (5, 5))

It seems that in the worst case there are 4 options (up, down, left, and right) at each step, each option being pushed to the stack and eventually examined during backtracking, so that the complexity would be bound to $O(4^n)$.
The actual time won't be so large, because we filter out the places which have been visited before. In the worst case, all the reachable points are visited exactly once, so the time is bound to $O(n)$, where $n$ is the total number of connected points. As a stack is used to store candidate solutions, the space complexity is $O(n^2)$.

Eight queens puzzle

The eight queens puzzle is also a famous problem. Although chess has a very long history, this puzzle was first published in 1848 by Max Bezzel [13]. The queen in the game of chess is quite powerful: it can attack any other piece in the same row, column, or diagonal, at any distance. The puzzle is to find a way to put 8 queens on the board so that none of them attack each other. Figure 14.28 (a) illustrates the places that can be attacked by a queen, and 14.28 (b) shows a solution of the 8 queens puzzle.

Figure 14.28: The eight queens puzzle. (a) A queen piece. (b) An example solution.

It's obvious that the puzzle can be solved by brute force, which takes $P_{64}^{8}$ attempts, a number of about $4 \times 10^{10}$. This can easily be improved by observing that no two queens can be in the same row, and each queen must be put in one column between 1 and 8. Thus we can represent an arrangement as a permutation of $\{1, 2, 3, 4, 5, 6, 7, 8\}$. For instance, the arrangement $\{6, 2, 7, 1, 3, 5, 8, 4\}$ means that we put the first queen at row 1, column 6, the second queen at row 2, column 2, ..., and the last queen at row 8, column 4. By this means, we need only examine $8! = 40320$ possibilities.
We can find better solutions than this. Similar to the maze puzzle, we put queens one by one, starting from the first row. For the first queen there are 8 options: we can put it at one of the eight columns. For the next queen, we again examine the 8 candidate columns; some of them are not valid because those positions would be attacked by the first queen. We repeat this process: for the $i$-th queen, we examine the 8 columns in row $i$ to find which ones are safe. If no column is valid, all the columns in this row would be attacked by some queen we've previously placed, and we have to backtrack, as in the maze puzzle. When all 8 queens have been successfully placed on the board, we have found a solution. In order to find all the possible solutions, we record it and go on examining other candidate columns, performing backtracking if necessary. This process terminates when all the columns in the first row have been examined. The below equation starts the search.

\[ solve(\{\phi\}, \phi) \tag{14.54} \]

In order to manage the candidate attempts, a stack $S$ is used, as in the maze puzzle. The stack is initialized with one empty element, and a list $L$ is used to record all possible solutions. Denote the top element in the stack as $s_1$; it is actually an intermediate state of assignment, a partial permutation of 1 to 8. After popping $s_1$, the stack becomes $S'$. The solve function can be defined as the following.
\[
solve(S, L) =
\begin{cases}
L & S = \phi \\
solve(S', \{s_1\} \cup L) & |s_1| = 8 \\
solve(\{\{i\} \cup s_1 | i \in [1, 8], i \notin s_1, safe(i, s_1)\} \cup S', L) & \text{otherwise}
\end{cases}
\tag{14.55}
\]

If the stack is empty, all the possible candidates have been examined and it's not possible to backtrack any more; $L$ has accumulated all the found solutions and is returned as the result. Otherwise, if the length of the top element in the stack is 8, a valid solution has been found: we add it to $L$ and go on finding other solutions. If the length is less than 8, we need to try to place the next queen. Among all the columns from 1 to 8, we pick those not already occupied by previous queens (through the $i \notin s_1$ clause), and those not attacked along a diagonal (through the $safe$ predicate). The valid assignments are pushed to the stack for further searching.

Function $safe(x, C)$ detects whether the assignment of a queen at position $x$ would be attacked by any of the queens in $C$ along a diagonal. There are 2 possible cases: the 45° and the 135° directions. Since the row of this new queen is $y = 1 + |C|$, where $|C|$ is the length of $C$, the safe function can be defined as the following.

\[ safe(x, C) = \forall (c, r) \in zip(reverse(C), \{1, 2, ...\}), |x - c| \neq |y - r| \tag{14.56} \]

Where $zip$ takes two lists and pairs their elements into a new list. Thus if $C = \{c_{i-1}, c_{i-2}, ..., c_2, c_1\}$ represents the columns of the first $i - 1$ queens assigned so far, the above function checks whether the list of pairs $\{(c_1, 1), (c_2, 2), ..., (c_{i-1}, i - 1)\}$ forms any diagonal line with position $(x, y)$. Translating this algorithm into Haskell gives the below example program.

solve = dfsSolve [[]] [] where
    dfsSolve [] s = s
    dfsSolve (c:cs) s
        | length c == 8 = dfsSolve cs (c:s)
        | otherwise = dfsSolve ([(x:c) | x <- [1..8] \\ c,
                                         not $ attack x c] ++ cs) s
    attack x cs = let y = 1 + length cs in
        any (\(c, r) -> abs(x - c) == abs(y - r)) $ zip (reverse cs) [1..]

Observing that the algorithm is tail recursive, it's easy to transform it into an imperative realization. Instead of using a list, we use an array to represent the queens assignment. Denote the stack as $S$, and the list of solutions as $L$. The imperative algorithm can be described as the following.

1: function Solve-Queens
2:     S ← {φ}
3:     L ← φ    ▷ The result list
4:     while S ≠ φ do
5:         A ← Pop(S)    ▷ A is an intermediate assignment
6:         if |A| = 8 then
7:             Add(L, A)
8:         else
9:             for i ← 1 to 8 do
10:                if Valid(i, A) then
11:                    Push(S, A ∪ {i})
12:    return L

The stack is initialized with the empty assignment. The main process repeatedly pops the top candidate from the stack. If there are still queens left to place, the algorithm examines the possible columns in the next row from 1 to 8; if a column is safe, i.e., it won't be attacked by any previous queen, the column is appended to the assignment and pushed back to the stack. Different from the functional approach, since an array, not a list, is used, we needn't reverse the solution assignment any more.
Function Valid checks whether column $x$ is safe given the previous queens in $A$. It filters out the columns already occupied, and checks that no diagonal line is formed with any existing queen.

1: function Valid(x, A)
2:     y ← 1 + |A|
3:     for i ← 1 to |A| do
4:         if x = A[i] ∨ |y - i| = |x - A[i]| then
5:             return False
6:     return True

The following Python example program implements this imperative algorithm.

def solve():
    stack = [[]]
    s = []
    while stack != []:
        a = stack.pop()
        if len(a) == 8:
            s.append(a)
        else:
            for i in range(1, 9):
                if valid(i, a):
                    stack.append(a + [i])
    return s

def valid(x, a):
    y = len(a) + 1
    for i in range(1, y):
        if x == a[i-1] or abs(y - i) == abs(x - a[i-1]):
            return False
    return True
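Running the program above enumerates every arrangement. The standard 8 × 8 board is known to have 92 distinct solutions, which makes a handy check:

print(len(solve()))   # 92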
Although there are 8 optional columns for each queen, not all of them are valid and thus further expanded: only the columns not occupied by previous queens are tried. The algorithm examines only 15720 possibilities [13], far fewer than $8^8 = 16777216$. It's quite easy to extend the algorithm to solve the $N$ queens puzzle, where $N \ge 4$. However, the time cost increases fast; the backtracking algorithm is only slightly better than the one permuting the sequence of 1 to $N$ (which is bound to $O(N!)$). Another extension to the algorithm is based on the fact that the chess board is square, and thus symmetric both vertically and horizontally: a solution can generate other solutions by rotating and flipping. These aspects are left as exercises to the reader.

Peg puzzle

I once received a puzzle of the leaping frogs, said to be homework for 2nd grade students in China. As illustrated in figure 14.29, there are 6 frogs on 7 stones. Each frog can either hop to the next stone if it is not occupied, or leap over one frog to an empty stone. The frogs on the left side can only move to the right, while the ones on the right side can only move to the left. These rules are described in figure 14.30.

Figure 14.29: The leaping frogs puzzle.

The goal of this puzzle is to arrange the frogs to jump according to the rules, so that the positions of the 3 frogs on the left are finally exchanged with the ones on the right. If we denote a frog from the left as `A', one from the right as `B', and the empty stone as `O', the puzzle is to find a solution transforming `AAAOBBB' into `BBBOAAA'.

This puzzle is just a special form of the peg puzzles. The number of pegs is not limited to 6; it can be 8 or another bigger even number. Figure 14.31 shows some variants.

Figure 14.30: Moving rules. (a) Jump to the next stone. (b) Jump over to the right. (c) Jump over to the left.

Figure 14.31: Variants of the peg puzzles, from http://home.comcast.net/~stegmann/jumping.htm. (a) Solitaire. (b) Hop over. (c) Draught board.

We can solve this puzzle by programming. The idea is similar to the 8 queens puzzle. Denote the positions from the leftmost stone as 1, 2, ..., 7. In the ideal case, there are 4 options for arranging a move: for example at the start, the frog on the 3rd stone can hop right to the empty stone; symmetrically, the frog on the 5th stone can hop left; alternatively, the frog on the 2nd stone can leap right, while the frog on the 6th stone can leap left. We can record the state and try one of these 4 options at every step. Of course not all of them are possible at any given time; if we get stuck, we can backtrack and try other options. As we restrict the left side frogs to moving only right, and the right side frogs to moving only left, the moves are not reversible, so there won't be any of the repetition cases we had to deal with in the maze puzzle. However, we still need to record the steps in order to print them out at the end.
In order to enforce these restrictions, let `A', `O', `B' in the representation `AAAOBBB' be -1, 0, and 1 respectively. A state $L$ is a list of such elements, starting from $\{-1, -1, -1, 0, 1, 1, 1\}$. $L[i]$ accesses the $i$-th element; its value indicates whether the $i$-th stone is empty, occupied by a frog from the left side, or occupied by a frog from the right side. Denote the position of the vacant stone as $p$. The 4 moving options can be stated as below.

- Leap left: $p < 6$ and $L[p + 2] > 0$; swap $L[p] \leftrightarrow L[p + 2]$.
- Hop left: $p < 7$ and $L[p + 1] > 0$; swap $L[p] \leftrightarrow L[p + 1]$.
- Leap right: $p > 2$ and $L[p - 2] < 0$; swap $L[p - 2] \leftrightarrow L[p]$.
- Hop right: $p > 1$ and $L[p - 1] < 0$; swap $L[p - 1] \leftrightarrow L[p]$.

Four functions $leap_l(L)$, $hop_l(L)$, $leap_r(L)$ and $hop_r(L)$ are defined accordingly. If the state $L$ does not satisfy the move restriction, these functions return $L$ unchanged; otherwise, the changed state $L'$ is returned.

We can also explicitly maintain a stack $S$ of the attempts, holding the historic movements as well. The stack is initialized with a singleton list of the starting state, and the solutions are accumulated in a list $M$, which is empty at the beginning:

\[ solve(\{\{\{-1, -1, -1, 0, 1, 1, 1\}\}\}, \phi) \tag{14.57} \]

As long as the stack isn't empty, we pop one intermediate attempt. If the latest state equals $\{1, 1, 1, 0, -1, -1, -1\}$, a solution has been found, and we append the series of moves leading to this state to the result list $M$; otherwise, we expand to the next possible states by trying all four possible moves, and push them back to the stack for further searching. Denote the top element in the stack $S$ as $s_1$, and the latest state in $s_1$ as $L$. The algorithm can be defined as the following.

\[
solve(S, M) =
\begin{cases}
M & S = \phi \\
solve(S', \{reverse(s_1)\} \cup M) & L = \{1, 1, 1, 0, -1, -1, -1\} \\
solve(P \cup S', M) & \text{otherwise}
\end{cases}
\tag{14.58}
\]

Where $P$ are the possible expansions from the latest state $L$:

\[ P = \{\{L'\} \cup s_1 | L' \in \{leap_l(L), hop_l(L), leap_r(L), hop_r(L)\}, L' \neq L\} \]
Note that the starting state is stored as the last element, while the final state is the first; that is why we reverse $s_1$ when adding it to the solution list. Translating this algorithm to Haskell gives the following example program.

solve = dfsSolve [[[-1, -1, -1, 0, 1, 1, 1]]] [] where
    dfsSolve [] s = s
    dfsSolve (c:cs) s
        | head c == [1, 1, 1, 0, -1, -1, -1] = dfsSolve cs (reverse c:s)
        | otherwise = dfsSolve ((map (:c) $ moves $ head c) ++ cs) s

moves s = filter (/=s) [leapLeft s, hopLeft s, leapRight s, hopRight s] where
    leapLeft [] = []
    leapLeft (0:y:1:ys) = 1:y:0:ys
    leapLeft (y:ys) = y:leapLeft ys
    hopLeft [] = []
    hopLeft (0:1:ys) = 1:0:ys
    hopLeft (y:ys) = y:hopLeft ys
    leapRight [] = []
    leapRight (-1:y:0:ys) = 0:y:(-1):ys
    leapRight (y:ys) = y:leapRight ys
    hopRight [] = []
    hopRight (-1:0:ys) = 0:(-1):ys
    hopRight (y:ys) = y:hopRight ys

Running this program finds 2 symmetric solutions, each taking 15 steps. One solution is listed in the table below.
step    state
  0     -1 -1 -1  0  1  1  1
  1     -1 -1  0 -1  1  1  1
  2     -1 -1  1 -1  0  1  1
  3     -1 -1  1 -1  1  0  1
  4     -1 -1  1  0  1 -1  1
  5     -1  0  1 -1  1 -1  1
  6      0 -1  1 -1  1 -1  1
  7      1 -1  0 -1  1 -1  1
  8      1 -1  1 -1  0 -1  1
  9      1 -1  1 -1  1 -1  0
 10      1 -1  1 -1  1  0 -1
 11      1 -1  1  0  1 -1 -1
 12      1  0  1 -1  1 -1 -1
 13      1  1  0 -1  1 -1 -1
 14      1  1  1 -1  0 -1 -1
 15      1  1  1  0 -1 -1 -1

Observe that the algorithm is in tail recursive manner; it can also be realized imperatively. The algorithm can be generalized to solve the puzzle with $n$ frogs on each side. We represent the start state $\{-1, -1, ..., -1, 0, 1, 1, ..., 1\}$ as $s$, and the mirrored end state as $e$.

1: function Solve(s, e)
2:     S ← {{s}}
3:     M ← φ
4:     while S ≠ φ do
5:         s1 ← Pop(S)
6:         if s1[1] = e then
7:             Add(M, Reverse(s1))
8:         else
9:             for ∀m ∈ Moves(s1[1]) do
10:                Push(S, {m} ∪ s1)
11:    return M

The possible moves are also generalized, with procedure Moves handling an arbitrary number of frogs. The following Python program implements this solution.

def solve(start, end):
    stack = [[start]]
    s = []
    while stack != []:
        c = stack.pop()
        if c[0] == end:
            s.append(list(reversed(c)))
        else:
            for m in moves(c[0]):
                stack.append([m] + c)
    return s

def moves(s):
    ms = []
    n = len(s)
    p = s.index(0)
    if p < n - 2 and s[p+2] > 0:
        ms.append(swap(s, p, p+2))
    if p < n - 1 and s[p+1] > 0:
        ms.append(swap(s, p, p+1))
    if p > 1 and s[p-2] < 0:
        ms.append(swap(s, p, p-2))
    if p > 0 and s[p-1] < 0:
        ms.append(swap(s, p, p-1))
    return ms

def swap(s, i, j):
    a = s[:]
    (a[i], a[j]) = (a[j], a[i])
    return a
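The original 3-frog puzzle is then solved by the call below; as noted above, it yields the 2 symmetric solutions of 15 moves each:

start = [-1, -1, -1, 0, 1, 1, 1]
end = [1, 1, 1, 0, -1, -1, -1]
print(len(solve(start, end)))   # 2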
For 3 frogs on each side, we know that it takes 15 steps to exchange them. It's interesting to examine how many steps are needed as the number of frogs on each side grows. Our program gives the following results.

number of frogs:  1   2   3   4   5  ...
number of steps:  3   8  15  24  35  ...

It seems that the number of steps is always a square number minus one, so it's natural to guess that the number of steps for $n$ frogs on each side is $(n + 1)^2 - 1$. Actually we can prove it. Comparing the final state and the start state, each frog moves ahead $n + 1$ stones in its direction; thus the $2n$ frogs in total move $2n(n + 1)$ stones. Another important fact is that each frog on the left has to meet every frog on the right exactly once, and a leap happens at each meeting. Since a frog moves two stones ahead by one leap, and there are $n^2$ meetings in total, these meetings account for moving $2n^2$ stones ahead. The remaining moves are not leaps but hops, and the number of hops is $2n(n + 1) - 2n^2 = 2n$. Summing up all $n^2$ leaps and $2n$ hops, the total number of steps is $n^2 + 2n = (n + 1)^2 - 1$.

Summary of DFS

Observe the above three puzzles: although they vary in many aspects, their solutions show quite similar common structures. They all have some starting state: the maze starts from the entrance point, the 8 queens puzzle from the empty board, and the leaping frogs from the state `AAAOBBB'. The solution is a kind of search, and at each attempt there are several possible ways: for the maze puzzle, four different directions to try; for the 8 queens puzzle, eight columns to choose; for the leaping frogs puzzle, four movements of leap or hop. We don't know how far we can go when making a decision, although the final state is clear: for the maze, it's the exit point; for the 8 queens puzzle, we are done when all 8 queens have been assigned on the board; for the leaping frogs puzzle, the final state is that all frogs have exchanged sides.

We use a common approach to solve them: we repeatedly select one possible candidate to try, recording where we've been; if we get stuck, we backtrack and try other options. By this strategy, we can be sure to either find a solution, or tell that the problem is unsolvable. There can of course be some variation: we can stop when one answer is found, or go on searching for all the solutions.
If we draw a tree rooted at the starting state, and expand it so that every branch stands for a different attempt, our searching process proceeds in a manner that goes deeper and deeper. We won't consider any other options at the same depth unless the search fails, so that we have to backtrack to an upper level of the tree. Figure 14.32 illustrates the order in which we search such a state tree: the arrows indicate how we go down and backtrack up, and the numbers on the nodes show the order in which we visit them.

Figure 14.32: Example of DFS search order.

This kind of search strategy is called `DFS' (depth-first search). We use it widely, sometimes unintentionally. Some programming environments, Prolog for instance, adopt DFS as the default evaluation model. A maze is given by a rule base, such as:

c(a, b).    c(a, e).
c(b, c).    c(b, f).
c(e, d).    c(e, f).
c(f, c).
c(g, d).    c(g, h).
c(h, f).

Where the predicate c(X, Y) means place X is connected with Y. Note that this is a directed predicate; we can make Y connected with X as well, by either adding a symmetric rule or creating an undirected predicate. Figure 14.33 shows such a directed graph. Given two places X and Y, Prolog can tell whether they are connected by the following program.

go(X, X).
go(X, Y) :- c(X, Z), go(Z, Y).

This program says that a place is connected with itself, and that, given two different places X and Y, if X is connected with Z, and Z is connected with Y, then X is connected with Y. Note that there might be multiple choices for Z. Prolog selects one candidate and goes on searching; it only tries other candidates if the recursive search fails, in which case Prolog backtracks and tries the alternatives. This is exactly what DFS does.

DFS is quite straightforward when we only need a solution, and don't care whether that solution takes the fewest steps. For example, the solution it gives may not be the shortest path through the maze. We'll see some more puzzles next, which demand finding the solution with the minimum number of attempts.
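The same rule base and search can be sketched in Python; go below is our hand-rolled analogue of the Prolog go/2 clauses, with an explicit visited set added (which the Prolog version lacks) to guard against cycles:

# the rule base c(X, Y) as an adjacency mapping
c = {'a': ['b', 'e'], 'b': ['c', 'f'], 'e': ['d', 'f'],
     'f': ['c'], 'g': ['d', 'h'], 'h': ['f']}

def go(x, y, visited=None):
    # depth-first reachability: try candidates one by one,
    # backtracking (via the 'any') when a branch fails
    if x == y:
        return True
    visited = visited or set()
    visited.add(x)
    return any(go(z, y, visited) for z in c.get(x, []) if z not in visited)

print(go('a', 'd'))   # True: a -> e -> d
print(go('c', 'a'))   # False: the edges are directed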
Figure 14.33: A directed graph.

The wolf, goat, and cabbage puzzle

This puzzle says that a farmer wants to cross a river with a wolf, a goat, and a bucket of cabbage. There is a boat, and only the farmer can row it; but the boat is small, and can only hold one of the wolf, the goat, or the bucket of cabbage together with the farmer at a time. The farmer has to take them across one by one. However, the wolf would eat the goat, and the goat would eat the cabbage, whenever the farmer is absent. The puzzle asks for the fastest solution by which they can all safely cross the river.

Figure 14.34: The wolf, goat, and cabbage puzzle.

The key point of this puzzle is that the wolf does not eat the cabbage. The farmer can therefore safely take the goat to the other side first. But the next time, no matter whether he takes the wolf or the cabbage across, he has to bring something back to avoid a conflict. In order to find the fastest solution: at any time, if the farmer has multiple options, we can examine all of them in parallel, so that these different decisions compete. If we count the number of times the farmer crosses the river, without considering direction (so that crossing back and forth counts as 2 times), then we are actually checking the complete set of possibilities after 1 crossing, 2 crossings, 3 crossings, ... When we find a situation in which they have all arrived at the other bank, we are done; that solution wins the competition and is the fastest one.

The problem is that we can't literally examine all the possible solutions in parallel. Even with a super computer equipped with many CPU cores, the setup is too expensive for such a simple puzzle. Let's consider a lucky draw game instead. People blindly pick from a box of colored balls. There is only one black ball; all the others are white. The one who picks the black ball wins the game; anyone else must return his ball to the box and wait for the next chance. In order to be fair, we set up the rule that no one may try a second time before all the others have tried, so we line people up in a queue. Each time, the first person picks a ball; if he does not win, he goes to stand at the tail of the queue to wait for his second try. This queue maintains our rule.

Figure 14.35: A lucky-draw game: the i-th person leaves the head of the queue, picks a ball, and rejoins the queue at the tail if he fails to pick the black ball.

We can use quite the same idea to solve our puzzle. The two banks of the river can be represented as two sets A and B: A contains the wolf, the goat, the cabbage, and the farmer, while B is empty. We take an element from one set to the other each time, and neither set may hold conflicting things when the farmer is absent. The goal is to exchange the contents of A and B in the fewest steps. We initialize a queue with the state $A = \{w, g, c, p\}, B = \phi$ as the only element.
  • 2447. rst element from the head, expand it with all possible options, and put these new expanded candidates to the tail of the queue. If the
  • 2448. rst element on the head is the
  • 2449. nal goal, that A = ;B = fw; g; c; pg, we are done. Figure 14.36 illustrates the idea of this search order. Note that as all possibilities in the same level are examined, there is no need for back-tracking.
Figure 14.36: Start from state 1, and check all possible options 2, 3, and 4 for the next step; then all nodes of level 3, ...

There is a simple way to treat the sets: a four-bit binary number can be used, with each bit standing for one thing; for example, the wolf w = 1, the goat g = 2, the cabbage c = 4, and the farmer p = 8. Then 0 stands for the empty set and 15 for the full set. The value 3 means there are exactly a wolf and a goat on a river bank; in this case the wolf will eat the goat. Similarly, the value 6 stands for the other conflicting case. Every time, we move the highest bit (which is 8, the farmer), either alone or together with one of the other bits (1, 2, or 4), from one number to the other. The possible moves can be defined as below.

mv(A, B) = { (A - 8 - i, B + 8 + i) | i ∈ {0, 1, 2, 4}, i = 0 ∨ A ∧ i ≠ 0 }  : B < 8
           { (A + 8 + i, B - 8 - i) | i ∈ {0, 1, 2, 4}, i = 0 ∨ B ∧ i ≠ 0 }  : otherwise        (14.59)

where ∧ is the bitwise-and operation. The condition B < 8 tells that the farmer is on bank A; i must either be 0 (the farmer crosses alone) or be an item present on the farmer's bank.
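As a concrete check (a worked example of ours, not from the text): in the initial state (15, 0) the farmer is on bank A since B < 8, so

mv(15, 0) = {(7, 8), (6, 9), (5, 10), (3, 12)},

that is, the farmer crosses alone or takes the wolf, the goat, or the cabbage. The valid function defined below rejects (6, 9) and (3, 12), which leave the goat alone with the cabbage and the wolf alone with the goat respectively. What remain are (7, 8), where the farmer crosses alone (a dead end whose only continuation returns to the already-recorded start state), and (5, 10), ferrying the goat, exactly as the intuition above suggests.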
The solution can be given by reusing the queue defined in the previous chapter. Denote the queue as Q, initialized with the singleton list {(15, 0)}. If Q is not empty, the function DeQ(Q) extracts the head element M, and the updated queue becomes Q′. M is a list of pairs standing for a series of movements between the river banks; its first element m1 = (A′, B′) is the latest state. EnQ′(Q, L) is a slightly different enqueue operation: it pushes all the possible moving sequences in L to the tail of the queue one by one and returns the updated queue. With these notations, the solution function is defined like below.

solve(Q) = ∅ : Q = ∅
           reverse(M) : A′ = 0
           solve(EnQ′(Q′, { {m} ∪ M | m ∈ mv(m1), valid(m, M) })) : otherwise        (14.60)

where the function valid(m, M) checks whether a new moving candidate m = (A″, B″) is valid: neither A″ nor B″ is 3 or 6, and m hasn't been tried before in M, to avoid repeated attempts.

valid(m, M) = A″ ≠ 3 ∧ A″ ≠ 6 ∧ B″ ≠ 3 ∧ B″ ≠ 6 ∧ m ∉ M        (14.61)

The following example Haskell program implements this solution. Note that it uses a plain list to represent the queue for illustration purposes.
import Data.Bits

solve = bfsSolve [[(15, 0)]] where
    bfsSolve :: [[(Int, Int)]] -> [(Int, Int)]
    bfsSolve [] = []    -- no solution
    bfsSolve (c:cs) | (fst $ head c) == 0 = reverse c
                    | otherwise = bfsSolve (cs ++ map (:c)
                                    (filter (`valid` c) $ moves $ head c))
    valid (a, b) r = not $ or [a `elem` [3, 6], b `elem` [3, 6], (a, b) `elem` r]
    moves (a, b) = if b < 8 then trans a b else map swap (trans b a) where
        trans x y = [(x - 8 - i, y + 8 + i) | i <- [0, 1, 2, 4],
                                              i == 0 || (x .&. i) /= 0]
        swap (x, y) = (y, x)
This algorithm can easily be modified to find all the possible solutions instead of stopping after the first one. This is left as an exercise to the reader. The following shows the two best solutions to this puzzle.

Solution 1:

Left bank                    | Right bank
wolf, goat, cabbage, farmer  |
wolf, cabbage                | goat, farmer
wolf, cabbage, farmer        | goat
cabbage                      | wolf, goat, farmer
goat, cabbage, farmer        | wolf
goat                         | wolf, cabbage, farmer
goat, farmer                 | wolf, cabbage
                             | wolf, goat, cabbage, farmer

Solution 2:

Left bank                    | Right bank
wolf, goat, cabbage, farmer  |
wolf, cabbage                | goat, farmer
wolf, cabbage, farmer        | goat
wolf                         | goat, cabbage, farmer
wolf, goat, farmer           | cabbage
goat                         | wolf, cabbage, farmer
goat, farmer                 | wolf, cabbage
                             | wolf, goat, cabbage, farmer

This algorithm can also be realized imperatively. Observing that our solution is tail-recursive, we can translate it directly into a loop. We use a list S to hold all the solutions found. The singleton list {(15, 0)} is pushed to the queue when initializing. As long as the queue isn't empty, we extract the head C by calling the DeQ procedure and examine whether it reaches the final goal; if not, we expand all the possible moves and push them to the tail of the queue for further searching.

1: function Solve
2:     S ← ∅
3:     Q ← ∅
4:     EnQ(Q, {(15, 0)})
5:     while Q ≠ ∅ do
6:         C ← DeQ(Q)
7:         if c1 = (0, 15) then
8:             Add(S, Reverse(C))
9:         else
10:            for each m ∈ Moves(C) do
11:                if Valid(m, C) then
12:                    EnQ(Q, {m} ∪ C)
13:    return S

where the Moves and Valid procedures are the same as before. The following Python example program implements this imperative algorithm.

def solve():
    s = []
    queue = [[(0xf, 0)]]
    while queue != []:
        cur = queue.pop(0)
        if cur[0] == (0, 0xf):
            s.append(list(reversed(cur)))
        else:
            for m in moves(cur):
                queue.append([m] + cur)
    return s

def moves(s):
    (a, b) = s[0]
    return valid(s, trans(a, b) if b < 8 else swaps(trans(b, a)))

def valid(s, mv):
    return [(a, b) for (a, b) in mv
            if a not in [3, 6] and b not in [3, 6] and (a, b) not in s]

def trans(a, b):
    masks = [8 | (1 << i) for i in range(4)]
    return [(a ^ mask, b | mask) for mask in masks if a & mask == mask]

def swaps(s):
    return [(b, a) for (a, b) in s]
There is a minor difference between the program and the pseudocode: the function that generates candidate moves filters out the invalid cases inside itself.

Every time, no matter whether the farmer drives the boat back or forth, there are at most 4 options for him to choose, namely crossing alone or with one of the objects on the bank he departs from; so the algorithm examines no more than 4^n cases by step n. This estimation is far above the actual cost, because we avoid trying the invalid cases. Our solution examines all the possible movements in the worst case; and because we check the recorded steps to avoid repeated attempts, the algorithm takes about O(n^2) time to search n possible steps.

Water jugs puzzle

This is a popular puzzle in classic AI, and its history seems to be quite long. It says that there are two jugs, one of 9 quarts and one of 4 quarts. How can we use them to bring up exactly 6 quarts of water from the river?
There are various versions of this puzzle, differing in the volumes of the jugs and the target volume of water. The solver is said to have been the young Blaise Pascal, the French mathematician and scientist, in one story, and Simeon Denis Poisson in another. Later, in the popular Hollywood movie `Die Hard 3', the actors Bruce Willis and Samuel L. Jackson were also confronted with this puzzle. Polya gave a nice way to solve this problem backwards in [14].

Figure 14.37: Two jugs with volumes of 9 and 4.

Instead of thinking from the starting state as shown in figure 14.37, Polya pointed out that there will be 6 quarts of water in the bigger jug at the final stage, which indicates the second-to-last step: we can fill the 9-quart jug, then pour out 3 quarts from it. In order to achieve this, there should be 1 quart of water left in the smaller jug, as shown in figure 14.38.

Figure 14.38: The last two steps.

It's easy to see that filling the 9-quart jug and then pouring into the 4-quart jug twice leaves 1 quart of water, as shown in figure 14.39. At this stage, we've found the solution; by reversing our findings, we can give the correct steps to bring up exactly 6 quarts of water.

Polya's methodology is general, but it's still hard to solve the puzzle without a concrete algorithm. For instance, how do we bring up 2 gallons of water with jugs of 899 and 1147 gallons? There are 6 ways to deal with the 2 jugs in total. Denote the smaller jug as A and the bigger jug as B.
Figure 14.39: Fill the bigger jug, and pour into the smaller one twice.

- Fill jug A from the river;
- Fill jug B from the river;
- Empty jug A;
- Empty jug B;
- Pour water from jug A to B;
- Pour water from jug B to A.

The following sequence shows an example. Note that in this example we assume a < b < 2a.

A        B        operation
0        0        start
a        0        fill A
0        a        pour A into B
a        a        fill A
2a - b   b        pour A into B
2a - b   0        empty B
0        2a - b   pour A into B
a        2a - b   fill A
3a - 2b  b        pour A into B
...      ...      ...

No matter which of the above operations are taken, the amount of water in each jug can be expressed as xa + yb for some integers x and y, where a and b are the volumes of the jugs. All the amounts of water we can get are linear combinations of a and b, so, given two jugs, we can immediately tell whether a goal g is solvable. For instance, we can't bring up 5 gallons of water with jugs of 4 and 6 gallons. Number theory ensures that the 2 water jugs puzzle can be solved if and only if g is divisible by the greatest common divisor of a and b, written as:

gcd(a, b) | g        (14.62)

where m | n means n can be divided by m. What's more, if a and b are relatively prime, that is gcd(a, b) = 1, it's possible to bring up any quantity g of water.
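This solvability test is a one-liner. The sketch below is ours (the name solvable isn't from the text):

-- The puzzle (a, b, g) is solvable iff gcd(a, b) divides g.
solvable :: Integer -> Integer -> Integer -> Bool
solvable a b g = g `mod` gcd a b == 0

For example, solvable 4 6 5 is False, while solvable 3 5 4 is True.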
Although gcd(a, b) enables us to determine whether the puzzle is solvable, it doesn't give us the detailed pouring sequence. If we can find integers x and y such that g = xa + yb, we can arrange a sequence of operations (even if it may not be the best solution) to solve it. The idea is that, without loss of generality, supposing x > 0 and y ≤ 0, we need to fill jug A x times and empty jug B |y| times in total. Let's take a = 3, b = 5, and g = 4 for example; since 4 = 3 × 3 - 5, we can arrange a sequence like the following.

A    B    operation
0    0    start
3    0    fill A
0    3    pour A into B
3    3    fill A
1    5    pour A into B
1    0    empty B
0    1    pour A into B
3    1    fill A
0    4    pour A into B

In this sequence, we fill A 3 times and empty B 1 time. The procedure can be described as the following:

Repeat x times:
1. Fill jug A;
2. Pour jug A into jug B; whenever B is full, empty it.

So the only problem left is to find x and y.
There is a powerful tool in number theory for this, the extended Euclid algorithm. Compared with the classic Euclid GCD algorithm, which gives only the greatest common divisor, the extended Euclid algorithm also gives a pair x, y such that:

(d, x, y) = gcdext(a, b)        (14.63)

where d = gcd(a, b) and ax + by = d. Without loss of generality, suppose a ≤ b; then there exist a quotient q and a remainder r such that:

b = aq + r        (14.64)

Since d is the common divisor, it divides both a and b, and thus d divides r as well. Because r is less than a, we can scale the problem down by finding the GCD of r and a:

(d, x′, y′) = gcdext(r, a)        (14.65)

where d = x′r + y′a according to the definition of the extended Euclid algorithm. Transforming b = aq + r to r = b - aq and substituting for r in the above equation yields:

d = x′(b - aq) + y′a
  = (y′ - x′q)a + x′b        (14.66)

This is a linear combination of a and b, so we have:

x = y′ - x′⌊b/a⌋
y = x′        (14.67)

Note that this is a typical recursive relationship. The edge case happens when a = 0:

gcd(0, b) = b = 0a + 1b        (14.68)

Summarizing the above results, the extended Euclid algorithm can be defined as the following:

gcdext(a, b) = (b, 0, 1) : a = 0
               (d, y′ - x′⌊b/a⌋, x′) : otherwise        (14.69)

where d, x′, y′ are defined by equation (14.65).
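As a concrete check (a worked example of ours): computing gcdext(4, 9) by (14.69) first recurses on gcdext(9 mod 4, 4) = gcdext(1, 4), which in turn recurses on gcdext(4 mod 1, 1) = gcdext(0, 1) = (1, 0, 1). Unwinding the recursion, gcdext(1, 4) = (1, 1 - 0 · ⌊4/1⌋, 0) = (1, 1, 0), and gcdext(4, 9) = (1, 0 - 1 · ⌊9/4⌋, 1) = (1, -2, 1); indeed -2 × 4 + 1 × 9 = 1.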
The 2 water jugs puzzle is almost solved, but there are still two details to tackle. First, the extended Euclid algorithm gives the linear combination for the greatest common divisor d, while the target volume g isn't necessarily equal to d. This is easily solved by multiplying x and y by m = g / gcd(a, b). Second, we assumed x > 0 in order to form a procedure that fills jug A x times, but the extended Euclid algorithm doesn't ensure x is positive; for instance, gcdext(4, 9) = (1, -2, 1). Whenever we get a negative x, since d = xa + yb, we can repeatedly add b to x and subtract a from y until x is greater than zero.

At this stage, we are able to give the complete solution to the 2 water jugs puzzle. Below is an example Haskell program.

extGcd 0 b = (b, 0, 1)
extGcd a b = let (d, x', y') = extGcd (b `mod` a) a
             in (d, y' - x' * (b `div` a), x')

solve a b g | g `mod` d /= 0 = []    -- no solution
            | otherwise = solve' (x * g `div` d)
  where
    (d, x, y) = extGcd a b
    solve' x | x < 0 = solve' (x + b)
             | otherwise = pour x [(0, 0)]
    pour 0 ps = reverse ((0, g):ps)
    pour x ps@((a', b'):_)
        | a' == 0   = pour (x - 1) ((a, b'):ps)    -- fill A
        | b' == b   = pour x ((a', 0):ps)          -- empty B
        | otherwise = pour x ((max 0 (a' + b' - b),
                               min (a' + b') b):ps)    -- pour A into B

Although we can solve the 2 water jugs puzzle with the extended Euclid algorithm, the solution may not be the best. For instance, to bring up 4 gallons of water with the 3- and 5-gallon jugs, the extended Euclid algorithm produces the following sequence:

[(0,0),(3,0),(0,3),(3,3),(1,5),(1,0),(0,1),(3,1),
 (0,4),(3,4),(2,5),(2,0),(0,2),(3,2),(0,5),(3,5),
 (3,0),(0,3),(3,3),(1,5),(1,0),(0,1),(3,1),(0,4)]
It takes 23 steps to achieve the goal, while the best solution needs only 6 steps:

[(0,0),(0,5),(3,2),(0,2),(2,0),(2,5),(3,4)]

Observing the 23 steps, we can find that jug B already contains 4 gallons of water at the 8th step, but the algorithm ignores this fact and goes on executing the remaining 15 steps. The reason is that the x and y found by the extended Euclid algorithm are not the only numbers satisfying g = xa + yb; and among all such numbers, the smaller |x| + |y| is, the fewer steps are needed. There is an exercise addressing this problem in this section.
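One possible direction for that exercise, sketched below with our own names (bestXY), reuses the extGcd program above. All integer solutions of g = xa + yb form a one-parameter family x = x0 + k(b/d), y = y0 - k(a/d); since |x| + |y| is a piecewise-linear convex function of k, its minimum lies at or next to one of the two breakpoints, so scanning a few candidates around them suffices.

import Data.List (minimumBy)
import Data.Ord (comparing)

-- Pick, among all (x, y) with g = x*a + y*b, a pair minimizing
-- |x| + |y|. Assumes gcd(a, b) divides g.
bestXY :: Integer -> Integer -> Integer -> (Integer, Integer)
bestXY a b g = minimumBy (comparing (\(x, y) -> abs x + abs y))
                   [(x0 + k * db, y0 - k * da) | k <- ks]
  where
    (d, x', y') = extGcd a b
    m           = g `div` d
    (x0, y0)    = (x' * m, y' * m)
    (da, db)    = (a `div` d, b `div` d)
    -- integer neighborhoods of the two breakpoints of |x| + |y|
    ks          = concat [[k - 1, k, k + 1]
                          | k <- [(-x0) `div` db, y0 `div` da]]

For a = 3, b = 5, g = 4 it returns (-2, 2), that is, fill B twice and empty A twice, which corresponds to the 6-step solution above, instead of the (8, -4) obtained by directly scaling the extended Euclid result.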
The interesting problem is how to find the best solution. There are two approaches: one is to find x and y that minimize |x| + |y|; the other is to adopt quite the same idea as in the wolf-goat-cabbage puzzle. We focus on the latter in this section. Since there are at most 6 possible options (fill A, fill B, pour A into B, pour B into A, empty A, and empty B), we can try them in parallel and check which decision leads to the best solution. We need to record all the states we've reached to avoid any potential repetition.

In order to realize this parallel approach with reasonable resources, a queue can be used to arrange our attempts. The elements stored in this queue are series of pairs (p, q), where p and q represent the volumes of water in the two jugs; these pairs record the sequence of our operations from the beginning to the latest. We initialize the queue with the singleton list containing the starting state (0, 0):

solve(a, b, g) = solve′({{(0, 0)}})        (14.70)

Every time the queue isn't empty, we pick a sequence from the head of the queue. If this sequence ends with a pair containing the target volume g, we have found a solution and can print the sequence by reversing it; otherwise, we expand the latest pair by trying all 6 possible options, remove any duplicated states, and add the new sequences to the tail of the queue.

Denote the queue as Q, the first sequence stored at the head of the queue as S, the latest pair in S as (p, q), and the rest of the pairs as S′. After popping the head element, the queue becomes Q′. The algorithm can be defined like below:

solve′(Q) = ∅ : Q = ∅
            reverse(S) : p = g ∨ q = g
            solve′(EnQ′(Q′, { {s′} ∪ S | s′ ∈ try(S) })) : otherwise        (14.71)

where the function EnQ′ pushes a list of sequences to the queue one by one. The function try(S) tries all 6 possible options to generate new pairs of water volumes:

try(S) = { s′ | s′ ∈ { fillA(p, q), fillB(p, q), pourA(p, q),
                       pourB(p, q), emptyA(p, q), emptyB(p, q) }, s′ ∉ S }        (14.72)

It's intuitive to define the 6 options:
for the fill operations, the result is that the filled jug becomes full; for the empty operations, the emptied jug becomes empty;
and for the pour operations, we need to test whether the receiving jug is big enough to hold all the water.

fillA(p, q) = (a, q)
fillB(p, q) = (p, b)
emptyA(p, q) = (0, q)
emptyB(p, q) = (p, 0)
pourA(p, q) = (max(0, p + q - b), min(p + q, b))
pourB(p, q) = (min(p + q, a), max(0, p + q - a))        (14.73)

The following example Haskell program implements this method:

solve' a b g = bfs [[(0, 0)]] where
    bfs [] = []
    bfs (c:cs) | fst (head c) == g || snd (head c) == g = reverse c
               | otherwise = bfs (cs ++ map (:c) (expand c))
    expand ((x, y):ps) = filter (`notElem` ps) $
        map (\f -> f x y) [fillA, fillB, pourA, pourB, emptyA, emptyB]
    fillA _ y = (a, y)
    fillB x _ = (x, b)
    emptyA _ y = (0, y)
    emptyB x _ = (x, 0)
    pourA x y = (max 0 (x + y - b), min (x + y) b)
    pourB x y = (min (x + y) a, max 0 (x + y - a))

This method always returns the fastest solution. It can also be realized in an imperative approach. Instead of storing the complete sequence of operations in every element of the queue, we can store each unique state in a global history list and use links to track the operation sequences; this saves space.

Figure 14.40: All attempted states are stored in a global list.

The idea is illustrated in figure 14.40.
The initial state is (0, 0), from which only `fill A' and `fill B' are possible. They are tried and added to the record list. Next we can try and record `fill B' on top of (3, 0), which yields the new state (3, 5). However, when we try `empty A' from state (3, 0), we would return to the start state (0, 0); as this previous state has already been recorded, it is ignored. All the repeated states are shown in gray in the figure.

With these settings, we needn't explicitly remember the operation sequence in each queue element. We can instead add a `parent' link to each node in figure 14.40 and use it to traverse back to the starting point from any state. The following example ANSI C code shows such a definition.
struct Step {
    int p, q;
    struct Step* parent;
};

struct Step* make_step(int p, int q, struct Step* parent) {
    struct Step* s = (struct Step*) malloc(sizeof(struct Step));
    s->p = p;
    s->q = q;
    s->parent = parent;
    return s;
}

where p and q are the volumes of water in the 2 jugs. For any state s, define functions p(s) and q(s) returning these 2 values; the imperative algorithm can then be realized as below.

1: function Solve(a, b, g)
2:     Q ← ∅
3:     Push-and-record(Q, (0, 0))
4:     while Q ≠ ∅ do
5:         s ← Pop(Q)
6:         if p(s) = g ∨ q(s) = g then
7:             return s
8:         else
9:             C ← Expand(s)
10:            for each c ∈ C do
11:                if c ≠ s ∧ ¬ Visited(c) then
12:                    Push-and-record(Q, c)
13:    return NIL

Push-and-record does not only push an element to the queue, but also records it as visited, so that we can later check whether an element has been visited. This can be implemented with a list: all push operations append the new element to the tail, and for the pop operation, instead of removing the element pointed to by head, the head pointer just advances to the next one. The list thus contains historic data, which has to be reset explicitly. The following ANSI C code illustrates this idea.

struct Step* steps[1000], **head, **tail = steps;

void push(struct Step* s) { *tail++ = s; }

struct Step* pop() { return *head++; }

int empty() { return head == tail; }
void reset() {
    struct Step** p;
    for (p = steps; p != tail; ++p)
        free(*p);
    head = tail = steps;
}

In order to test whether a state has been visited, we traverse the list and compare p and q.

int eq(struct Step* a, struct Step* b) {
    return a->p == b->p && a->q == b->q;
}

int visited(struct Step* s) {
    struct Step** p;
    for (p = steps; p != tail; ++p)
        if (eq(*p, s)) return 1;
    return 0;
}

The main program can be implemented as below:

struct Step* solve(int a, int b, int g) {
    int i;
    struct Step *cur, *cs[6];
    reset();
    push(make_step(0, 0, NULL));
    while (!empty()) {
        cur = pop();
        if (cur->p == g || cur->q == g)
            return cur;
        else {
            expand(cur, a, b, cs);
            for (i = 0; i < 6; ++i)
                if (!eq(cur, cs[i]) && !visited(cs[i]))
                    push(cs[i]);
        }
    }
    return NULL;
}

where the function expand tries all 6 possible options:

void expand(struct Step* s, int a, int b, struct Step** cs) {
    int p = s->p, q = s->q;
    cs[0] = make_step(a, q, s);                              /* fill A */
    cs[1] = make_step(p, b, s);                              /* fill B */
    cs[2] = make_step(0, q, s);                              /* empty A */
    cs[3] = make_step(p, 0, s);                              /* empty B */
    cs[4] = make_step(max(0, p + q - b), min(p + q, b), s);  /* pour A */
    cs[5] = make_step(min(p + q, a), max(0, p + q - a), s);  /* pour B */
}

The resulting steps are linked in reverse order; they can be output with a recursive function:
void print(struct Step* s) {
    if (s) {
        print(s->parent);
        printf("%d, %d\n", s->p, s->q);
    }
}

Kloski

Kloski is a block-sliding puzzle. It appears in many countries, with different sizes and layouts. Figure 14.41 illustrates a traditional Kloski game in China.

Figure 14.41: `Huarong Dao', the traditional Kloski game in China. (a) Initial layout of blocks; (b) Block layout after several movements.

In this puzzle, there are 10 blocks, each labeled with text or an icon. The smallest blocks have a size of 1 unit square; the biggest one is 2 × 2 units. Note there is a slot 2 units wide at the middle-bottom of the board. The biggest block represents a king in ancient times, while the others are his enemies. The goal is to move the biggest block to the slot, so that the king can escape. This game is named `Huarong Dao', or `Huarong Escape', in China. Figure 14.42 shows a similar Kloski puzzle in Japan, where the biggest block represents a daughter and the others are her family members; that game is named `Daughter in the box' (Japanese name: hakoiri musume).

Figure 14.42: `Daughter in the box', the Kloski game in Japan.
In this section, we want to find a solution that slides the blocks from the initial state to the final state with the minimum number of movements.

The intuitive idea for modeling this puzzle is a 5 × 4 matrix representing the board, with all pieces labeled by a number. The following matrix M, for example, shows the initial state of the puzzle.

M = 1  10 10 2
    1  10 10 2
    3  4  4  5
    3  7  8  5
    6  0  0  9

In this matrix, the cells with value i are covered by the i-th piece; the special value 0 represents a free cell. By using the sequence 1, 2, ... to identify the pieces, a layout can be further simplified as an array L, where each element is the list of cells covered by the piece with that index. For example, L[4] = {(3, 2), (3, 3)} means that the 4th piece covers the cells at positions (3, 2) and (3, 3), where (i, j) denotes the cell at row i and column j. The starting layout can be written as the following array:

{ {(1, 1), (2, 1)}, {(1, 4), (2, 4)}, {(3, 1), (4, 1)}, {(3, 2), (3, 3)},
  {(3, 4), (4, 4)}, {(5, 1)}, {(4, 2)}, {(4, 3)}, {(5, 4)},
  {(1, 2), (1, 3), (2, 2), (2, 3)} }

When moving the Kloski blocks, we examine all 10 blocks, checking whether each block can move up, down, left, or right. It seems this approach would lead to a huge number of possibilities: if each step had 10 × 4 options, there would be about 40^n cases by the n-th step. Actually, there aren't that many options. For example, in the first step only 4 moves are valid: the 6th piece moves right, the 7th and 8th move down, and the 9th moves left; all other moves are invalid.

Figure 14.43 shows how to test whether a move is possible. The left example illustrates sliding the block labeled 1 down: of the two cells this block will cover, the upper one is currently occupied by the same block (also labeled 1), and the lower one is a free cell (labeled 0). The right example, on the other hand, illustrates an invalid slide: the upper cell could move into a cell occupied by the same block, but the lower cell labeled 1 can't move into the cell occupied by another block, labeled 2.

So, to test whether a move is valid, we examine all the cells the block is about to cover: if every one of them is labeled either 0 or with the same number as the moving block, the move is valid; otherwise it conflicts with some other block. For a layout L with corresponding matrix M, suppose we want to move the k-th block by (Δx, Δy), where |Δx| ≤ 1 and |Δy| ≤ 1. The following equation tells whether the move is valid:

valid(L, k, Δx, Δy):
    ∀(i, j) ∈ L[k] ⇒ i′ = i + Δy, j′ = j + Δx,
    (1, 1) ≤ (i′, j′) ≤ (5, 4), M(i′, j′) ∈ {k, 0}        (14.74)
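As a concrete check against the starting matrix above (a worked example of ours): moving the 6th piece right shifts L[6] = {(5, 1)} to (5, 2), and M(5, 2) = 0, so the move is valid. Sliding the 1st piece down shifts {(1, 1), (2, 1)} to {(2, 1), (3, 1)}: M(2, 1) = 1 is the moving block itself, but M(3, 1) = 3, so the move conflicts with the 3rd piece and is rejected.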
Figure 14.43: Left: both the upper and the lower 1 are OK; Right: the upper 1 is OK, but the lower 1 conflicts with 2.

Another important point in solving the Kloski puzzle is how to avoid repeated attempts. The obvious case is when a series of slides leads back to a matrix we have seen before. However, avoiding identical matrices isn't enough. Consider the following two matrices: although M1 ≠ M2, we should drop the options leading to M2, because the two are essentially the same layout.

M1 = 1  10 10 2        M2 = 2  10 10 1
     1  10 10 2             2  10 10 1
     3  4  4  5             3  4  4  5
     3  7  8  5             3  7  6  5
     6  0  0  9             8  0  0  9

This tells us that we should compare layouts, not merely matrices, to avoid repetition. Denoting the corresponding layouts as L1 and L2 respectively, it's easy to verify that ||L1|| = ||L2||, where ||L|| is the normalized layout, defined as below:

||L|| = sort({ sort(l) | ∀ l ∈ L })        (14.75)

In other words, a normalized layout is ordered as a whole, and every element of it is ordered as well. The ordering can be defined as (a, b) ≤ (c, d) ⇔ an + b ≤ cn + d, where n is the width of the matrix.

Observe also that the Kloski board is symmetric, so a layout can be the mirror of another one. A mirrored layout is also a kind of repetition and should be avoided. The following M1 and M2 show such an example.
M1 = 10 10 1  2        M2 = 3  1  10 10
     10 10 1  2             3  1  10 10
     3  5  4  4             4  4  2  5
     3  5  8  9             7  6  2  5
     6  7  0  0             0  0  9  8

Note that their normalized layouts are mirror images of each other. It's easy to get a mirrored layout like this:

mirror(L) = { {(i, n - j + 1) | ∀(i, j) ∈ l} | ∀ l ∈ L }        (14.76)

We find that the matrix representation is useful for validating moves, while the layout is handy for modeling the moves and avoiding repeated attempts. We can use an approach similar to before to solve the Kloski puzzle. We need a queue in which every element contains two parts: a series of moves, and the latest layout the moves lead to. Each move has the form (k, (Δy, Δx)), meaning that the k-th block moves by Δy rows and Δx columns on the board.

The queue contains the starting layout when initialized. While the queue isn't empty, we pick the first element from the head and check whether the biggest block has arrived at the target, that is, L[10] = {(4, 2), (4, 3), (5, 2), (5, 3)}. If yes, we are done; otherwise, we try to move every block in the 4 directions (left, right, up, and down), and store all the possible, unique new layouts at the tail of the queue. During this search, we record all the normalized layouts we've ever found, to avoid any duplication.

Denote the queue as Q, the set of historic layouts as H, the first layout at the head of the queue as L with corresponding matrix M, and the move sequence leading to this layout as S. The algorithm can be defined as the following:

solve(Q, H) = ∅ : Q = ∅
              reverse(S) : L[10] = {(4, 2), (4, 3), (5, 2), (5, 3)}
              solve(Q′, H′) : otherwise        (14.77)
The first clause says that if the queue is empty, we've tried all the possibilities without finding a solution. The second clause finds a solution and returns the move sequence in reverse order. These are the two edge cases. Otherwise, the algorithm expands the current layout, puts all the valid new layouts at the tail of the queue to yield Q′, updates the set of normalized layouts to H′, and then searches recursively.

In order to expand a layout into the valid, unique new layouts, we can define a function as below:

expand(L, H) = { (k, (Δy, Δx)) | k ∈ {1, 2, ..., 10},
                 (Δy, Δx) ∈ {(0, -1), (0, 1), (-1, 0), (1, 0)},
                 valid(L, k, Δx, Δy), unique(L′, H) }        (14.78)

where L′ is the new layout obtained by moving the k-th block by (Δy, Δx) from L, M′ is the matrix corresponding to L′, and M″ is the matrix of the mirrored layout of L′. The function unique is defined like this:

unique(L′, H) = M′ ∉ H ∧ M″ ∉ H        (14.79)
We'll next show some example Haskell Kloski programs. As arrays aren't mutable in the purely functional setting, a tree-based map is used to represent the layout (alternatively, the finger tree based sequence shown in the previous chapter could be used). Some type synonyms are defined as below:

import qualified Data.Map as M
import Data.Ix
import Data.List (sort)

type Point = (Integer, Integer)
type Layout = M.Map Integer [Point]
type Move = (Integer, Point)

data Ops = Op Layout [Move]

The main program is almost the same as the solve(Q, H) function defined above.

solve :: [Ops] -> [[[Point]]] -> [Move]
solve [] _ = []    -- no solution
solve (Op x seq : cs) visit
    | M.lookup 10 x == Just [(4, 2), (4, 3), (5, 2), (5, 3)] = reverse seq
    | otherwise = solve q visit'
  where
    ops = expand x visit
    visit' = map (layout . move x) ops ++ visit
    q = cs ++ [Op (move x op) (op:seq) | op <- ops]

The function layout gives the normalized form by sorting, and move returns the updated map after sliding the i-th block by (Δy, Δx).

layout = sort . map sort . M.elems

move x (i, d) = M.update (Just . map (flip shift d)) i x

shift (y, x) (dy, dx) = (y + dy, x + dx)

The function expand gives all the possible new options. It can be directly translated from expand(L, H).

expand :: Layout -> [[[Point]]] -> [Move]
expand x visit = [(i, d) | i <- [1..10],
                           d <- [(0, -1), (0, 1), (-1, 0), (1, 0)],
                           valid i d, unique i d]
  where
    valid i d = all (\p -> let p' = shift p d in
                        inRange (bounds board) p' &&
                        (M.keys $ M.filter (elem p') x) `elem` [[i], []])
                    (maybe [] id $ M.lookup i x)
    unique i d = let mv = move x (i, d) in
                 all (`notElem` visit) (map layout [mv, mirror mv])

Note that we also filter out the mirrored layouts. The mirror function is given as the following:

mirror = M.map (map (\(y, x) -> (y, 5 - x)))

This program takes several minutes to produce the best solution, which takes 116 steps.
The final 3 steps are shown below:

...

['5', '3', '2', '1']
['5', '3', '2', '1']
['7', '9', '4', '4']
['A', 'A', '6', '0']
['A', 'A', '0', '8']

['5', '3', '2', '1']
['5', '3', '2', '1']
['7', '9', '4', '4']
['A', 'A', '0', '6']
['A', 'A', '0', '8']

['5', '3', '2', '1']
['5', '3', '2', '1']
['7', '9', '4', '4']
['0', 'A', 'A', '6']
['0', 'A', 'A', '8']

total 116 steps

The Kloski solution can also be realized imperatively. Note that solve(Q, H) is tail-recursive, so it's easy to transform the algorithm into a loop. We can also link each layout to its parent, so that the move sequence is recorded globally. This saves space, as the queue needn't store the moves in every element; when outputting the result, we only need to trace back from the last layout to the starting one.

Suppose the function Link(L′, L) links a new layout L′ to its parent layout L. The following algorithm takes a starting layout and searches for the best move sequence.

1: function Solve(L0)
2:     H ← {||L0||}
3:     Q ← ∅
4:     Push(Q, Link(L0, NIL))
5:     while Q ≠ ∅ do
6:         L ← Pop(Q)
7:         if L[10] = {(4, 2), (4, 3), (5, 2), (5, 3)} then
8:             return L
9:         else
10:            for each L′ ∈ Expand(L, H) do
11:                Push(Q, Link(L′, L))
12:                Append(H, ||L′||)
13:    return NIL    ▷ No solution

The following example Python program implements this algorithm:

class Node:
    def __init__(self, l, p = None):
        self.layout = l
        self.parent = p
def solve(start):
    visit = [normalize(start)]
    queue = [Node(start)]
    while queue != []:
        cur = queue.pop(0)
        layout = cur.layout
        if layout[-1] == [(4, 2), (4, 3), (5, 2), (5, 3)]:
            return cur
        else:
            for brd in expand(layout, visit):
                queue.append(Node(brd, cur))
                visit.append(normalize(brd))
    return None    # no solution

where normalize and expand are implemented as below:

def normalize(layout):
    return sorted([sorted(r) for r in layout])

def expand(layout, visit):
    def bound(y, x):
        return 1 <= y and y <= 5 and 1 <= x and x <= 4
    def valid(m, i, y, x):
        return m[y - 1][x - 1] in [0, i]
    def unique(brd):
        (m, n) = (normalize(brd), normalize(mirror(brd)))
        return all(m != v and n != v for v in visit)
    s = []
    d = [(0, -1), (0, 1), (-1, 0), (1, 0)]
    m = matrix(layout)
    for i in range(1, 11):
        for (dy, dx) in d:
            if all(bound(y + dy, x + dx) and valid(m, i, y + dy, x + dx)
                   for (y, x) in layout[i - 1]):
                brd = move(layout, (i, (dy, dx)))
                if unique(brd):
                    s.append(brd)
    return s

Unlike our 1-based description, arrays are indexed from 0 in Python, as in most programming languages; this has to be handled properly. The rest of the functions, mirror, matrix, and move, are implemented as the following:

def mirror(layout):
    return [[(y, 5 - x) for (y, x) in r] for r in layout]

def matrix(layout):
    m = [[0] * 4 for _ in range(5)]
    for (i, ps) in zip(range(1, 11), layout):
        for (y, x) in ps:
            m[y - 1][x - 1] = i
    return m

def move(layout, delta):
    (i, (dy, dx)) = delta
    m = dup(layout)
    m[i - 1] = [(y + dy, x + dx) for (y, x) in m[i - 1]]
    return m

def dup(layout):
    return [r[:] for r in layout]

It's possible to modify this Kloski algorithm so that it doesn't stop at the fastest solution, but searches for all the solutions. In that case, the computation time is bounded by the size of the state space V, which holds all the layouts that can be transformed from the starting layout. If all these layouts are stored globally with a parent field pointing to the predecessor, the space requirement of this algorithm is also bounded by O(V).
Summary of BFS

The above three puzzles, the wolf-goat-cabbage puzzle, the water jugs puzzle, and the Kloski puzzle, show a common solution structure. Like the DFS problems, they all have a starting state and an end state. The wolf-goat-cabbage puzzle starts with the wolf, the goat, the cabbage, and the farmer all on one side and the other side empty, and ends with all of them moved to the other side. The water jugs puzzle starts with two empty jugs and ends with either jug containing a certain volume of water. The Kloski puzzle starts from one layout and ends at another layout in which the biggest block has slid to a given position. Every puzzle specifies a set of rules for transferring from one state to another.

Different from the DFS approach, we try all the possible options `in parallel': we won't search any further until all the alternatives of the same step have been examined. This method ensures that a solution with the minimum number of steps is found before any longer one. Reviewing and comparing the two figures we've drawn before shows the difference between the two approaches: because the latter expands the search horizontally, it is called breadth-first search (BFS for short).

Figure 14.44: Search orders for (a) depth-first search and (b) breadth-first search.

As we can't really perform the search in parallel, a BFS realization typically utilizes a queue to store the search options. The candidate with the fewest steps is popped from the head, while new candidates with more steps are pushed to the tail of the queue. Note that the queue should offer constant-time enqueue and dequeue operations, as explained in the previous chapter about queues; strictly speaking, the example functional programs shown above don't meet this criterion, since they use a plain list to mimic the queue, which gives only linear-time pushing. Readers can replace them with the functional queue we explained before.

BFS provides a simple method to search for the solution that is optimal in terms of the number of steps. However, it can't search for the more generally optimal solution. Consider the directed graph shown in figure 14.45, where the length of each road section varies. We can't use BFS to find the shortest route from one city to another.

Figure 14.45: A weighted directed graph.

Note that the shortest route from city a to city c isn't the one with the fewest steps, a → b → c, whose total length is 22; the route with more steps, a → e → f → c, is the best, with a length of 20. The coming sections introduce other algorithms for searching optimal solutions.
14.3.2 Search the optimal solution

Searching for the optimal solution is quite important in many respects. People want the `best' solution to save time, space, cost, or energy, yet it's not easy to find the best solution with limited resources. Many optimization problems can still only be solved by brute force; nevertheless, for some of them there exist special, simplified ways to search for the optimal solution.

Greedy algorithm

Huffman coding

Huffman coding is a solution for encoding information with the shortest code length. Consider the popular ASCII code, which uses 7 bits to encode characters, digits, and symbols; it can represent 2^7 = 128 different symbols. With only the bits 0 and 1, we need at least ⌈log2 n⌉ bits to distinguish n different symbols. For text containing only case-insensitive English letters, for instance, n = 26 gives ⌈log2 26⌉ = 5 bits per letter, and we can define a code table like below.
char  code       char  code
A     00000      N     01101
B     00001      O     01110
C     00010      P     01111
D     00011      Q     10000
E     00100      R     10001
F     00101      S     10010
G     00110      T     10011
H     00111      U     10100
I     01000      V     10101
J     01001      W     10110
K     01010      X     10111
L     01011      Y     11000
M     01100      Z     11001

With this code table, the text `INTERNATIONAL' is encoded into 65 bits:

00010101101100100100100011011000000110010001001110101100000011010

Observe that the above code table actually maps the letters `A' to `Z' to the numbers 0 to 25, using 5 bits for every code: code zero is written as `00000', not `0', for example. Such a coding method is called fixed-length coding.
Another coding method is variable-length coding: use just one bit, `0', for `A', two bits, `10', for `C', and 5 bits, `11001', for `Z'. Although this approach can dramatically shorten the total code length for `INTERNATIONAL' from 65 bits, it causes a problem when decoding. When processing a bit sequence like `1101', we don't know whether it means `1' followed by `101' (standing for `BF'), `110' followed by `1' (`GB'), or `1101' (`N'). The famous Morse code is a variable-length coding system: the most-used letter, `E', is encoded as a dot, while `Z' is encoded as two dashes and two dots. Morse code uses a special pause separator to indicate the termination of a code, so the above problem doesn't happen.

There is another solution to avoid ambiguity. Consider the following code table.

char  code       char  code
A     110        E     1110
I     101        L     1111
N     01         O     000
R     001        T     100

The text `INTERNATIONAL' is encoded into only 38 bits:

10101100111000101110100101000011101111

Decoding these bits against the above code table meets no ambiguity, because no symbol's code is the prefix of another's. Such a code is called a prefix code. (You may wonder why it isn't called a non-prefix code.) By using prefix codes, we don't need separators at all, so the code length can be shortened.

This raises a very interesting problem: can we find a prefix-code table which produces the shortest code for a given text? The very same problem was given to David A. Huffman in 1951, when he was still a student at MIT [15]. His professor
Robert M. Fano told the class that those who solved this problem wouldn't need to take the final exam. Huffman had almost given up and begun preparing for the final exam when he found the most efficient answer.

The idea is to create the code table according to the frequency of each symbol in the text: the more often a symbol is used, the shorter its code. Processing some text and counting the occurrences of each symbol is not hard, so we end up with a symbol set in which every symbol is augmented with a weight. The weight can be any number indicating the symbol's frequency, for example the number of occurrences or a probability.

Huffman discovered that a binary tree can be used to generate prefix codes: all symbols are stored in the leaf nodes, and the codes are generated by traversing the tree from the root, appending a zero when going left and a one when going right. Figure 14.46 illustrates such a binary tree. Taking the symbol `N' for example, starting from the root we first go left, then right, and arrive at `N'; thus the code for `N' is `01'. For the symbol `A', we go right, right, then left, so `A' is encoded as `110'. Note that this approach ensures no code is the prefix of another.

Figure 14.46: An encoding tree.

This tree can also be used directly for decoding. When scanning a series of bits, we go left on a zero and right on a one; when we arrive at a leaf, we decode the symbol of that leaf and restart from the root for the coming bits.

Given a list of symbols with weights, we need to build such a binary tree so that symbols with greater weights have shorter paths from the root. Huffman developed a bottom-up solution. At the start, every symbol is put into a leaf node. Each time, we pick the two nodes with the smallest weights and merge them into a branch node whose weight is the sum of its two children. We repeatedly pick and merge the two smallest-weighted nodes until only one tree is left. Figure 14.47 illustrates such a building process.

Figure 14.47: Steps to build a Huffman tree.

We can reuse the binary tree definition to formalize Huffman coding: we augment it with the weight information, and symbols are stored only in leaf nodes. The following C-like definition shows an example.
struct Node {
    int w;
    char c;
    struct Node *left, *right;
};

Some constraints can be added to the definition, as the empty tree isn't allowed: a Huffman tree is either a leaf, containing a symbol and its weight, or a branch, holding only the total weight of all its leaves. The following Haskell code, for instance, explicitly specifies these two cases.

data HTr w a = Leaf w a | Branch w (HTr w a) (HTr w a)

When merging two Huffman trees T1 and T2 into a bigger one, the two trees become its children (either one can be selected as the left child, the other as the right), and the weight of the result tree T is the sum of its children's weights, so that w = w1 + w2. Define T1 ≤ T2 if w1 ≤ w2.
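The merge operation and the ordering are used by all the building programs below but aren't spelled out in the text. A minimal sketch of ours, assuming the data definition above additionally derives Eq:

-- Read the augmented weight of a tree.
weight :: HTr w a -> w
weight (Leaf w _)     = w
weight (Branch w _ _) = w

-- Merging puts the two trees under a branch carrying the summed weight.
merge :: Num w => HTr w a -> HTr w a -> HTr w a
merge a b = Branch (weight a + weight b) a b

-- The T1 <= T2 relation defined in the text: compare trees by weight,
-- so that min and max in the programs below pick the lighter tree.
instance (Ord w, Eq a) => Ord (HTr w a) where
    compare t1 t2 = compare (weight t1) (weight t2)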
One possible Huffman tree building algorithm can be realized as the following:

build(A) = T1 : A = {T1}
           build({merge(Ta, Tb)} ∪ A′) : otherwise        (14.80)

A is a list of trees, initialized with leaves for all the symbols and their weights. If there is only one tree in this list, we are done: that tree is the final Huffman tree. Otherwise, the two smallest trees Ta and Tb are extracted, the rest of the trees are held in the list A′, and Ta and Tb are merged into one bigger tree which is put back to the list for further recursive building:

(Ta, Tb, A′) = extract(A)        (14.81)

We can scan the tree list to extract the 2 nodes with the smallest weights. The equation below shows that when the scan begins, the first 2 elements are compared and taken as the two minimum ones so far; an empty accumulator is passed as the last argument:

extract(A) = extract′(min(T1, T2), max(T1, T2), {T3, T4, ...}, ∅)        (14.82)

For every tree, if its weight is less than one of the smallest two found so far, we update the result to contain this tree. For any given tree list A, denote the first tree in it as T1 and the rest of the trees except T1 as A′. The scan process can be defined as the following:

extract′(Ta, Tb, A, B) = (Ta, Tb, B) : A = ∅
                         extract′(T′a, T′b, A′, {Tb} ∪ B) : T1 < Tb
                         extract′(Ta, Tb, A′, {T1} ∪ B) : otherwise        (14.83)

where T′a = min(T1, Ta) and T′b = max(T1, Ta) are the updated two smallest-weight trees. The following Haskell example program implements this Huffman tree building algorithm.
build [x] = x
build xs = build ((merge x y) : xs') where
    (x, y, xs') = extract xs

extract (x:y:xs) = min2 (min x y) (max x y) xs [] where
    min2 x y [] xs = (x, y, xs)
    min2 x y (z:zs) xs | z < y = min2 (min z x) (max z x) zs (y:xs)
                       | otherwise = min2 x y zs (z:xs)

This building solution can also be realized imperatively. Given an array of Huffman nodes, we can use the last two cells to hold the nodes with the smallest weights, then scan the rest of the array from right to left. Whenever we meet a node with a smaller weight, that node is exchanged with the bigger of the last two. After all nodes have been examined, we merge the trees in the last two cells and drop the last cell, shrinking the array by one. We repeat this process until only one tree is left.

1: function Huffman(A)
2:     while |A| > 1 do
3:         n ← |A|
4:         for i ← n - 2 down to 1 do
5:             if A[i] < Max(A[n], A[n - 1]) then
6:                 Exchange A[i] ↔ Max(A[n], A[n - 1])
7:         A[n - 1] ← Merge(A[n], A[n - 1])
8:         Drop(A[n])
9:     return A[1]

The following C++ example program implements this algorithm. Note that it doesn't need the last two elements to be ordered.

typedef vector<Node*> Nodes;

bool lessp(Node* a, Node* b) { return a->w < b->w; }

Node* max(Node* a, Node* b) { return lessp(a, b) ? b : a; }

void swap(Nodes& ts, int i, int j, int k) {
    swap(ts[i], ts[lessp(ts[j], ts[k]) ? k : j]);
}

Node* huffman(Nodes ts) {
    int n;
    while ((n = ts.size()) > 1) {
        for (int i = n - 3; i >= 0; --i)
            if (lessp(ts[i], max(ts[n-1], ts[n-2])))
                swap(ts, i, n-1, n-2);
        ts[n-2] = merge(ts[n-1], ts[n-2]);
        ts.pop_back();
    }
    return ts.front();
}

This algorithm merges all the leaves, scanning the list in each iteration; thus its performance is quadratic.
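As a usage sketch (ours, assuming the HTr type, merge, and the weight-based ordering given earlier), building the tree for the symbol frequencies of `INTERNATIONAL' takes one call:

-- Symbol/weight leaves for "INTERNATIONAL".
leaves :: [HTr Int Char]
leaves = [Leaf 2 'A', Leaf 1 'E', Leaf 2 'I', Leaf 1 'L',
          Leaf 3 'N', Leaf 1 'O', Leaf 1 'R', Leaf 2 'T']

-- The resulting tree has total weight 13, the length of the text.
tree :: HTr Int Char
tree = build leaves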
This algorithm can be improved. Observing that each time only the two trees with the smallest weights are merged reminds us of the heap data structure, which ensures fast access to the smallest element. We can put all the leaves into a heap (for a binary heap, this is typically a linear-time operation), then extract the minimum element twice, merge the two, and put the bigger tree back into the heap. With a binary heap these are O(lg n) operations, so the total performance is O(n lg n), which is better than the above algorithm.

The next algorithm extracts the node from the heap and starts the Huffman tree building:

build(H) = reduce(top(H), pop(H))        (14.84)

The reduction stops when the heap is empty; otherwise, it extracts another node from the heap for merging:

reduce(T, H) = T : H = ∅
               build(insert(merge(T, top(H)), pop(H))) : otherwise        (14.85)

The functions build and reduce are mutually recursive. The following Haskell example program implements this algorithm using the heap defined in the previous chapter.
huffman' :: (Num a, Ord a) => [(b, a)] -> HTr a b
huffman' = build' . Heap.fromList . map (\(c, w) -> Leaf w c) where
    build' h = reduce (Heap.findMin h) (Heap.deleteMin h)
    reduce x Heap.E = x
    reduce x h = build' $ Heap.insert (Heap.deleteMin h)
                                      (merge x (Heap.findMin h))

The heap solution can also be realized imperatively. The leaves are first transformed into a heap, so that the one with the minimum weight is on top. As long as there is more than one element in the heap, we extract the two smallest, merge them into a bigger one, and put it back into the heap. The final tree left in the heap is the resulting Huffman tree.

1: function Huffman'(A)
2:     Build-Heap(A)
3:     while |A| > 1 do
4:         Ta ← Heap-Pop(A)
5:         Tb ← Heap-Pop(A)
6:         Heap-Push(A, Merge(Ta, Tb))
7:     return Heap-Pop(A)

The following example C++ code implements this heap solution. The heap used here is provided by the standard library; because a max-heap rather than a min-heap is built by default, a `greater' predicate is explicitly passed as argument.

bool greaterp(Node* a, Node* b) { return b->w < a->w; }

Node* pop(Nodes& h) {
    Node* m = h.front();
    pop_heap(h.begin(), h.end(), greaterp);
    h.pop_back();
    return m;
}
void push(Node* t, Nodes& h) {
    h.push_back(t);
    push_heap(h.begin(), h.end(), greaterp);
}

Node* huffman1(Nodes ts) {
    make_heap(ts.begin(), ts.end(), greaterp);
    while (ts.size() > 1) {
        Node* t1 = pop(ts);
        Node* t2 = pop(ts);
        push(merge(t1, t2), ts);
    }
    return ts.front();
}

When the symbol-weight list is already sorted, there exists a linear-time method to build the Huffman tree. Observe that during the building, the merged trees are produced with weights in ascending order; we can therefore use a queue to manage them. Each time, we pick the two trees with the smallest weights from the queue and the list together, merge them, and push the result to the queue. All the trees in the list get processed, and the one tree finally left in the queue is the resulting Huffman tree. The process starts by passing an empty queue, as below:

build′(A) = reduce′(extract″(∅, A))        (14.86)

Suppose A is in ascending order of weight. At any time, the tree with the smallest weight is either at the head of the queue or is the first element of the list. Denote the head of the queue as Ta (after popping it, the queue becomes Q′), the first element of A as Tb, and the rest of the elements as A′. The function extract″ can be defined like the following:

extract″(Q, A) = (Tb, (Q, A′)) : Q = ∅
                 (Ta, (Q′, A)) : A = ∅ ∨ Ta < Tb
                 (Tb, (Q, A′)) : otherwise        (14.87)

The pair of queue and tree list can actually be viewed as a special heap, from which the tree with the minimum weight is continuously extracted and merged:

reduce′(T, (Q, A)) = T : Q = ∅ ∧ A = ∅
                     reduce′(extract″(push(Q″, merge(T, T′)), A″)) : otherwise        (14.88)

where (T′, (Q″, A″)) = extract″(Q, A), which extracts another tree. The following Haskell example program shows the implementation of this method. Note that it explicitly sorts the leaves, which isn't necessary if they are already ordered; and again, a plain list rather than a real queue is used here for illustration. Lists aren't good at pushing new elements; please refer to the chapter about queues for the details.
huffman'' :: (Num a, Ord a) => [(b, a)] -> HTr a b
huffman'' = reduce . wrap . sort . map (\(c, w) -> Leaf w c) where
    wrap xs = delMin ([], xs)
    reduce (x, ([], [])) = x
    reduce (x, h) = let (y, (q, xs)) = delMin h in
                    reduce $ delMin (q ++ [merge x y], xs)
    delMin ([], (x:xs)) = (x, ([], xs))
    delMin ((q:qs), []) = (q, (qs, []))
    delMin ((q:qs), (x:xs)) | q < x = (q, (qs, (x:xs)))
                            | otherwise = (x, ((q:qs), xs))

This algorithm can also be realized imperatively.

1: function Huffman(A)    ▷ A is ordered by weight
2:     Q ← ∅
3:     T ← Extract(Q, A)
4:     while Q ≠ ∅ ∨ A ≠ ∅ do
5:         Push(Q, Merge(T, Extract(Q, A)))
6:         T ← Extract(Q, A)
7:     return T

The function Extract(Q, A) extracts the tree with the smallest weight from the queue or the list, mutating the queue or the list as necessary. Denote the head of the queue as Ta and the first element of the list as Tb.
1: function Extract(Q, A)
2:     if Q ≠ ∅ ∧ (A = ∅ ∨ Ta < Tb) then
3:         return Pop(Q)
4:     else
5:         return Detach(A)

The procedure Detach(A) removes the first element from A and returns it as the result. In most imperative settings, detaching the first element of an array is a slow, linear-time operation, so we can instead store the trees in descending order of weight and remove the last element, which is a fast constant-time operation. The C++ example code below shows this idea.

Node* extract(queue<Node*>& q, Nodes& ts) {
    Node* t;
    if (!q.empty() && (ts.empty() || lessp(q.front(), ts.back()))) {
        t = q.front();
        q.pop();
    } else {
        t = ts.back();
        ts.pop_back();
    }
    return t;
}

Node* huffman2(Nodes ts) {
    queue<Node*> q;
    sort(ts.begin(), ts.end(), greaterp);
    Node* t = extract(q, ts);
    while (!q.empty() || !ts.empty()) {
        q.push(merge(t, extract(q, ts)));
        t = extract(q, ts);
    }
    return t;
}
Note that the sorting isn't necessary if the trees are already ordered; and if they are in ascending order of weight, a linear-time reversal suffices.

Three different Huffman tree building methods have now been explained. Although they all follow the approach developed by Huffman, the resulting trees vary. Figure 14.48 shows the three different Huffman trees built for the same symbol list with these methods: (a) created by the scan method; (b) created by the heap method; (c) built in linear time from the sorted list.

Figure 14.48: Variations of Huffman trees for the same symbol list.

Although these three trees are not identical, they are all able to generate the most efficient code. The formal proof is skipped here; the details can be found in [15] and in Section 16.3 of [2].

Tree building is the core idea of Huffman coding, and many things can be achieved easily with the Huffman tree. For example, the code table can be generated by traversing the tree. We start from the root with the empty prefix p.
At any branch, we append a zero to the prefix when turning left, and a one when turning right. When a leaf node is reached, the symbol represented by that node, together with the prefix, is added to the code table. Denote the symbol of a leaf node as c, and the children of a tree T as Tl and Tr respectively. The code table association list can be built with code(T, ∅), which is defined as below:

code(T, p) = {(c, p)} : leaf(T)
             code(Tl, p ∪ {0}) ∪ code(Tr, p ∪ {1}) : otherwise        (14.89)

where the function leaf(T) tests whether tree T is a leaf or a branch node.
The following Haskell example program generates a map as the code table according to this algorithm.

code tr = Map.fromList $ traverse [] tr where
    traverse bits (Leaf _ c) = [(c, bits)]
    traverse bits (Branch _ l r) = (traverse (bits ++ [0]) l) ++
                                   (traverse (bits ++ [1]) r)

The imperative code table generating algorithm is left as an exercise. The encoding process scans the text and looks up the code table to output the bit sequence; its realization is skipped here.
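For completeness, a minimal encoder sketch of ours (the name encode isn't from the text): look every symbol up in the table produced by code above and concatenate the bit lists.

encode :: Ord c => Map.Map c [Int] -> [c] -> [Int]
encode tbl = concatMap (tbl Map.!)    -- fails if a symbol is not in the table

Encoding `INTERNATIONAL' with the table built from the tree in figure 14.46 reproduces the 38-bit sequence shown earlier.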
  • 2594. rst one are hold in B0, below de
  • 2595. nition realizes the decoding algorithm. decode(T;B) = 8 : fcg : B = ^ leaf(T) fcg [ decode(root(T);B) : leaf(T) decode(Tl;B0) : b1 = 0 decode(Tr;B0) : otherwise (14.90) Where root(T) returns the root of the Human tree. The following Haskell example code implements this algorithm. decode tr cs = find tr cs where find (Leaf _ c) [] = [c] find (Leaf _ c) bs = c : find tr bs find (Branch _ l r) (b:bs) = find (if b == 0 then l else r) bs Note that this is an on-line decoding algorithm with linear time performance. It consumes one bit per time. This can be clearly noted from the below imper-ative realization, where the index keeps increasing by one. 1: function Decode(T;B) 2: W 3: n jBj; i 1 4: while i n do 5: R T 6: while : Leaf(R) do 7: if B[i] = 0 then 8: R Left(R) 9: else 10: R Right(R) 11: i i + 1 12: W W[ Symbol(R) 13: return W This imperative algorithm can be implemented as the following example C++ program. string decode(Node root, const char bits) {
Huffman coding, and especially the Huffman tree building, shows an interesting strategy. Each time, there are multiple options for merging. Among the trees in the list, the Huffman method always selects the two trees with the smallest weight. This is the best choice at that merge stage. However, this series of locally best options generates a globally optimal prefix code.

It's not always the case that the locally optimal choice also leads to the globally optimal solution. In most cases, it doesn't. Huffman coding is a special one. We call the strategy of always choosing the locally best option the greedy strategy. The greedy method works for many problems. However, it's not easy to tell whether the greedy method can be applied to get the globally optimal solution. The generic formal proof is still an active research area. Section 16.4 in [2] provides a good treatment of the Matroid tool, which covers many problems to which the greedy algorithm can be applied.

Change-making problem

We often change money when visiting other countries. People tend to use credit cards more often nowadays than before, because it's quite convenient to buy things without thinking much about change. If we changed some money at the bank, there is often some foreign money left by the end of the trip. Some people like to change it to coins for collection. Can we find a solution which can change a given amount of money with the least number of coins?
Let's use the USA coin system for example. There are 5 different coins: 1 cent, 5 cents, 25 cents, 50 cents, and 1 dollar. A dollar is equal to 100 cents. Using the greedy method introduced above, we can always pick the largest coin which is not greater than the remaining amount of money to be changed. Denote the list C = {1, 5, 25, 50, 100}, which stands for the values of the coins. For any given money X, the change coins can be generated as below.

change(X, C) = { ∅                           : X = 0
               { {c_m} ∪ change(X − c_m, C)  : otherwise, c_m = max {c ∈ C, c ≤ X}      (14.91)

If C is in descending order, c_m can be found as the first coin not greater than X. If we want to change 1.42 dollars, this function produces a coin list of {100, 25, 5, 5, 5, 1, 1}. The output coin list can be easily transformed to contain pairs {(100, 1), (25, 1), (5, 3), (1, 2)}: we need one dollar, a quarter, three coins of 5 cents, and 2 coins of 1 cent to make the change. The following Haskell example program outputs the result in that form.

solve x = assoc . change x where
    change 0 _ = []
    change x cs = let c = head $ filter (<= x) cs in c : change (x - c) cs
    assoc = (map (\cs -> (head cs, length cs))) . group

As mentioned above, this program assumes the coins are given in descending order, for instance like below.

solve 142 [100, 50, 25, 5, 1]

This algorithm is tail recursive; it can be transformed into an imperative loop.

1: function Change(X, C)
2:   R ← ∅
3:   while X ≠ 0 do
4:     c_m ← max {c ∈ C, c ≤ X}
5:     R ← {c_m} ∪ R
6:     X ← X − c_m
7:   return R

The following example Python program implements this imperative version and manages the result with a dictionary.

def change(x, coins):
    cs = {}
    while x != 0:
        m = max([c for c in coins if c <= x])
        cs[m] = 1 + cs.setdefault(m, 0)
        x = x - m
    return cs
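For instance, changing 1.42 dollars reproduces the pairs listed above (on Python 3.7+ the dictionary prints in insertion order):

change(142, [100, 50, 25, 5, 1])   # => {100: 1, 25: 1, 5: 3, 1: 2}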
For a coin system like the USA's, the greedy approach can find the optimal solution: the number of coins is the minimum. Fortunately, our greedy method works in most countries. But it is not always true. For example, suppose a country has coins of value 1, 3, and 4 units. The best change for value 6 is to use two coins of 3 units; however, the greedy method gives a result of three coins: one coin of 4 and two coins of 1, which isn't the optimal result.
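The Python program above reproduces this failure directly:

change(6, [4, 3, 1])   # => {4: 1, 1: 2}, three coins
                       # while {3: 2}, two coins of 3, is optimal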
Summary of greedy method

As shown in the change-making problem, the greedy method doesn't always give the best result. In order to find the optimal solution, we need dynamic programming, which will be introduced in the next section.

However, the result is often good enough in practice. Let's take the word-wrap problem for example. In modern software editors and browsers, text spans multiple lines if the content is too long to be held in one. With word-wrap supported, the user needn't insert hard line breaks. Although dynamic programming can wrap with the minimum number of lines, it's overkill. On the contrary, a greedy algorithm can wrap with a number of lines close to the optimal result, with a quite effective realization as below. Here it wraps text T, not exceeding line width W, with space s between each word.

1: L ← W
2: for w ∈ T do
3:   if |w| + s > L then
4:     Insert line break
5:     L ← W − |w|
6:   else
7:     L ← L − |w| − s

For each word w in the text, it uses a greedy strategy to put as many words on a line as possible, unless the line width would be exceeded. Many word processors use a similar algorithm to do word-wrapping.
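As a concrete illustration, here is a small Python sketch of this greedy wrapper; the function name wrap, the words-as-list input, and returning a list of lines are assumptions of this sketch, not the book's definitions.

def wrap(words, width):
    lines, line = [], ""
    for w in words:
        if not line:
            line = w                          # first word on the line
        elif len(line) + 1 + len(w) <= width:
            line += " " + w                   # greedily extend the line
        else:
            lines.append(line)                # break before this word
            line = w
    if line:
        lines.append(line)
    return lines

# wrap("the quick brown fox jumps over the lazy dog".split(), 15)
# => ['the quick brown', 'fox jumps over', 'the lazy dog']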
There are many cases where the strict optimal result, and not just an approximate one, is necessary. Dynamic programming can help to solve such problems.

Dynamic programming

In the change-making problem, we mentioned that the greedy method can't always give the optimal solution. For any coin system, is there a way to find the best changes?

Suppose we have found the best solution which makes X value of money. The coins needed are contained in C_m. We can partition these coins into two collections, C1 and C2, which make money of X1 and X2 respectively. We'll prove that C1 is the optimal solution for X1, and C2 is the optimal solution for X2.

Proof. For X1, suppose there exists another solution C1' which uses fewer coins than C1. Then the changing solution C1' ∪ C2 uses fewer coins to make X than C_m does. This conflicts with the fact that C_m is the optimal solution for X. Similarly, we can prove that C2 is the optimal solution for X2.

Note that the reverse direction does not hold. If we arbitrarily select a value Y < X and divide the original problem into finding the optimal solutions for the sub-problems Y and X − Y, combining the two optimal solutions doesn't necessarily yield an optimal solution for X. Consider this example. There are coins with values 1, 2, and 4. The optimal solution for making value 6 is to use 2 coins, of value 2 and 4; however, if we divide 6 = 3 + 3, since each 3 can be made with the optimal solution 3 = 1 + 2, the combined solution contains 4 coins (1 + 1 + 2 + 2).

If an optimization problem can be divided into several optimal sub-problems, we say it has optimal substructure. We see that the change-making problem has optimal substructure, but the dividing has to be done based on the coins, not on an arbitrary value. The optimal substructure can be expressed recursively as the following.

change(X) = { ∅                                            : X = 0
            { least({{c} ∪ change(X − c) | c ∈ C, c ≤ X})  : otherwise      (14.92)

For any coin system C, the changing result for zero is empty; otherwise, we check every candidate coin c which is not greater than value X, and recursively find the best solution for X − c; we pick the coin collection which contains the fewest coins as the result. The below Haskell example program implements this top-down recursive solution.
change _ 0 = []
change cs x = minimumBy (compare `on` length)
                  [c : change cs (x - c) | c <- cs, c <= x]

Although this program outputs the correct answer [2, 4] when evaluating change [1, 2, 4] 6, it performs very badly when changing 1.42 dollars with the USA coin system. It failed to find the answer within 15 minutes on a computer with a 2.7GHz CPU and 8G memory.

The reason it's so slow is that there is a lot of duplicated computation in the top-down recursive solution. When it computes change(142), it needs to examine change(141), change(137), change(117), change(92), and change(42). As change(141) is in turn computed from smaller values by deducting 1, 5, 25, 50 and 100 cents, it will eventually meet the values 137, 117, 92, and 42 again. The search space explodes as a power of 5.

This is quite similar to computing Fibonacci numbers in a top-down recursive way.

F_n = { 1                 : n = 1 ∨ n = 2
      { F_{n−1} + F_{n−2} : otherwise      (14.93)

When we calculate F8 for example, we recursively calculate F7 and F6. But when we calculate F7, we need to calculate F6 again, and F5, ... As shown in the below expanded forms, the calculation doubles every time, and the same values are calculated again and again.

F8 = F7 + F6
   = F6 + F5 + F5 + F4
   = F5 + F4 + F4 + F3 + F4 + F3 + F3 + F2
   = ...

In order to avoid duplicated computation, a table F can be maintained while calculating the Fibonacci numbers. The first two cells are filled as 1; all others are left blank. During the top-down recursive calculation, if we need F_k, we first look up the k-th cell of this table; if it isn't blank, we use that value directly. Otherwise we need further calculation. Whenever a value is calculated, we store it in the corresponding cell for future look-ups.

1: F ← {1, 1, NIL, NIL, ...}
2: function Fibonacci(n)
3:   if n > 2 ∧ F[n] = NIL then
4:     F[n] ← Fibonacci(n − 1) + Fibonacci(n − 2)
5:   return F[n]
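In Python, the same memoization can be sketched with a dictionary (an illustrative version, not from the book):

memo = {1: 1, 2: 1}

def fib(n):
    # compute and record the value only when it is missing
    if n not in memo:
        memo[n] = fib(n - 1) + fib(n - 2)
    return memo[n]

# fib(8) == 21, and each F_k is now computed exactly once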
By using a similar idea, we can develop a new top-down change-making solution. We use a table T to maintain the best changes; it is initialized with all empty coin lists. During the top-down recursive computation, we look up this table for smaller changing values. Whenever an intermediate value is calculated, it is stored in the table.

1: T ← {∅, ∅, ...}
2: function Change(X)
3:   if X > 0 ∧ T[X] = ∅ then
4:     for c ∈ C do
5:       if c ≤ X then
6:         C_m ← {c} ∪ Change(X − c)
7:         if T[X] = ∅ ∨ |C_m| < |T[X]| then
8:           T[X] ← C_m
9:   return T[X]

The solution to change 0 money is definitely the empty set ∅; otherwise, we look up T[X] to retrieve the solution for changing X money. If it is empty, we need to recursively calculate it. We examine all coins in the coin system C which are not greater than X. This yields the sub-problem of making changes for money X − c. The minimum set of coins plus one coin of c is finally stored in T[X] as the result. The following example Python program implements this algorithm; it takes just 8000 ms to give the answer for changing 1.42 dollars in the US coin system.

tab = [[] for _ in range(1000)]

def change(x, cs):
    if x > 0 and tab[x] == []:
        for s in [[c] + change(x - c, cs) for c in cs if c <= x]:
            if tab[x] == [] or len(s) < len(tab[x]):
                tab[x] = s
    return tab[x]

Another way to calculate Fibonacci numbers is to compute them in the order F1, F2, F3, ..., F_n. This is quite natural when people write down the Fibonacci series.

1: function Fibo(n)
2:   F ← {1, 1, NIL, NIL, ...}
3:   for i ← 3 to n do
4:     F[i] ← F[i − 1] + F[i − 2]
5:   return F[n]
We can use a quite similar idea to solve the change-making problem. Starting from zero money, which can be changed with an empty list of coins, we next try to figure out how to change money of value 1. In the US coin system, for example, a cent can be used; the next values of 2, 3, and 4 can be changed with two, three, and four coins of 1 cent. At this stage, the solution table looks like below.

0   1    2       3          4
∅   {1}  {1, 1}  {1, 1, 1}  {1, 1, 1, 1}

The interesting case happens when changing value 5. There are two options: use another coin of 1 cent, which needs 5 coins in total; or use 1 coin of 5 cents, which uses fewer coins than the former. So the solution table can be extended to this.

0   1    2       3          4             5
∅   {1}  {1, 1}  {1, 1, 1}  {1, 1, 1, 1}  {5}

For the next change value, 6, since there are two types of coin, 1 cent and 5 cents, less than this value, we need to examine both of them. If we choose the 1 cent coin, we next need to make change for 5; since we already know that the best solution for changing 5 is {5}, which only needs one coin of 5 cents, by looking up the solution table we have one candidate solution for changing 6, namely {5, 1}. The other option is to choose the 5 cent coin; we then need to make change for 1. By looking up the solution table we've filled so far, the optimal sub-solution for changing 1 is {1}. Thus we get another candidate solution for changing 6, namely {1, 5}. It happens that both options yield a solution of two coins, and we can select either of them as the best solution.

Generally speaking, the candidate with the fewest coins is selected as the solution and filled into the table. At any iteration, when we are trying to change a value i ≤ X of money, we examine all the types of coin. For any coin c not greater than i, we look up the solution table to fetch the sub-solution T[i − c]. The number of coins in this sub-solution plus the one coin of c is the total number of coins needed in this candidate solution. The fewest candidate is then selected and written into the solution table. The following algorithm realizes this bottom-up idea.

1: function Change(X)
2:   T ← {∅, ∅, ...}
3:   for i ← 1 to X do
4:     for c ∈ C, c ≤ i do
5:       if T[i] = ∅ ∨ 1 + |T[i − c]| < |T[i]| then
6:         T[i] ← {c} ∪ T[i − c]
7:   return T[X]

This algorithm can be directly translated to imperative programs, in Python for example.

def changemk(x, cs):
    s = [[] for _ in range(x+1)]
    for i in range(1, x+1):
        for c in cs:
            if c <= i and (s[i] == [] or 1 + len(s[i-c]) < len(s[i])):
                s[i] = [c] + s[i-c]
    return s[x]
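The 1.42 dollar case that defeated the plain top-down program now returns immediately, and with the minimal seven coins:

sorted(changemk(142, [1, 5, 25, 50, 100]))   # => [1, 1, 5, 5, 5, 25, 100]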
Observing the solution table, it's easy to find that there is much duplicated content being stored.

6       7          8             9                10      ...
{1, 5}  {1, 1, 5}  {1, 1, 1, 5}  {1, 1, 1, 1, 5}  {5, 5}  ...

This is because the optimal sub-solutions are completely copied and saved in the parent solutions. In order to use less space, we can record only the `delta' part relative to the sub-optimal solution. In the change-making problem, this means we only need to record the coin selected for value i.

1: function Change'(X)
2:   T ← {0, ∞, ∞, ...}
3:   S ← {NIL, NIL, ...}
4:   for i ← 1 to X do
5:     for c ∈ C, c ≤ i do
6:       if 1 + T[i − c] < T[i] then
7:         T[i] ← 1 + T[i − c]
8:         S[i] ← c
9:   while X > 0 do
10:    Print(S[X])
11:    X ← X − S[X]

Instead of recording the complete solution list of coins, this new algorithm uses two tables, T and S. T holds the minimum number of coins needed for changing values 0, 1, 2, ...; while S holds the first coin selected for the optimal solution. For the complete coin list to change money X, the first coin is thus S[X], and the optimal sub-solution is to change money X' = X − S[X]. We can look up S[X'] for the next coin. The coins for the sub-solutions are repeatedly looked up like this till the beginning of the table. The below Python example program implements this algorithm.

def chgmk(x, cs):
    cnt = [0] + [x+1] * x
    s = [0]
    for i in range(1, x+1):
        coin = 0
        for c in cs:
            if c <= i and 1 + cnt[i-c] < cnt[i]:
                cnt[i] = 1 + cnt[i-c]
                coin = c
        s.append(coin)
    r = []
    while x > 0:
        r.append(s[x])
        x = x - s[x]
    return r

This change-making solution loops n times for a given amount of money n. It examines at most the full coin system in each iteration. The time is bound to Θ(nk), where k is the number of coins in the coin system. The algorithm adds O(n) space to record the sub-solutions in tables T and S.
In purely functional settings, there is no means to mutate the solution table and look it up in constant time. One alternative is to use a finger tree, as mentioned in the previous chapter. (Some purely functional programming environments, Haskell for instance, provide built-in arrays; other almost pure ones, such as ML, provide mutable arrays.) We store pairs of the minimum number of coins and the coin that leads to the optimal sub-solution. The solution table, which is a finger tree, is initialized as T = {(0, 0)}, meaning that changing 0 money needs no coin. We can fold over the list {1, 2, ..., X}, starting from this table, with a binary function change(T, i). The folding builds the solution table, and we then construct the coin list from this table with the function make(X, T).

makeChange(X) = make(X, fold(change, {(0, 0)}, {1, 2, ..., X}))      (14.94)

In the function change(T, i), all the coins not greater than i are examined to select the one leading to the best result. The fewest number of coins and the coin selected form a pair. This pair is inserted into the finger tree, so that a new solution table is returned.

change(T, i) = insert(T, fold(sel, (∞, 0), {c | c ∈ C, c ≤ i}))      (14.95)

Again, folding is used to select the candidate with the minimum number of coins. This folding starts with the initial value (∞, 0), over all valid coins. Function sel((n, c), c') accepts two arguments: one is a pair of count and coin, which is the best solution so far; the other is a candidate coin. It examines whether this candidate makes a better solution.

sel((n, c), c') = { (1 + n', c') : 1 + n' < n, where (n', _) = T[i − c']
                  { (n, c)       : otherwise      (14.96)

After the solution table is built, the coins needed can be generated from it.

make(X, T) = { ∅                     : X = 0
             { {c} ∪ make(X − c, T)  : otherwise, where (n, c) = T[X]      (14.97)

The following example Haskell program uses Data.Sequence, the library implementation of finger trees, to implement the change-making solution.

import Data.Sequence (Seq, singleton, index, (|>))

changemk x cs = makeChange x $ foldl change (singleton (0, 0)) [1..x] where
    change tab i = let sel c = min (1 + fst (index tab (i - c)), c)
                   in tab |> (foldr sel ((x + 1), 0) $ filter (<= i) cs)
    makeChange 0 _ = []
    makeChange x tab = let c = snd $ index tab x in c : makeChange (x - c) tab

It's necessary to memoize the optimal solutions to sub-problems no matter whether the top-down or the bottom-up approach is used. This is because a sub-problem is used many times when computing the overall optimal solution. This property is called overlapping sub-problems.
Properties of dynamic programming

Dynamic programming was originally named by Richard Bellman in the 1940s. It is a powerful tool to search for the optimal solutions of problems with two properties.

- Optimal substructure. The problem can be broken down into smaller problems, and the optimal solution can be constructed efficiently from the solutions of these sub-problems;

- Overlapping sub-problems. The problem can be broken down into sub-problems which are reused several times in finding the overall solution.

The change-making problem, as we've explained, has both optimal substructure and overlapping sub-problems.

Longest common subsequence problem

The longest common subsequence problem is different from the longest common substring problem. We've shown how to solve the latter in the chapter on suffix trees. The longest common subsequence needn't be a consecutive part of the original sequence.

Figure 14.49: The longest common subsequence

For example, the longest common substring of the texts Mississippi and Missunderstanding is Miss, while the longest common subsequence of them is Misssi. This is shown in figure 14.49. If we rotate the figure vertically and consider the two texts as two pieces of source code, it turns out to be a `diff' result between them. Most modern version control tools need to calculate the differences between versions, and the longest common subsequence problem plays a very important role there.
If either of the two strings X and Y is empty, the longest common subsequence LCS(X, Y) is definitely empty. Otherwise, denote X = {x1, x2, ..., xn}, Y = {y1, y2, ..., ym}. If the first elements x1 and y1 are the same, we can recursively find the longest common subsequence of X' = {x2, x3, ..., xn} and Y' = {y2, y3, ..., ym}, and the final result LCS(X, Y) can be constructed by concatenating x1 with LCS(X', Y'). Otherwise, if x1 ≠ y1, we need to recursively find the longest common subsequences LCS(X, Y') and LCS(X', Y), and pick the longer one as the final result. Summarizing these cases gives the below definition.

LCS(X, Y) = { ∅                               : X = ∅ ∨ Y = ∅
            { {x1} ∪ LCS(X', Y')              : x1 = y1
            { longer(LCS(X, Y'), LCS(X', Y))  : otherwise      (14.98)

Note that this algorithm clearly shows the optimal substructure: the longest common subsequence problem can be broken into smaller problems, and each sub-problem is ensured to be at least one element shorter than the original one.

It's also clear that there are overlapping sub-problems: the longest common subsequences of the substrings are used multiple times in finding the overall optimal solution. The existence of these two properties, the optimal substructure and the overlapping sub-problems, indicates that dynamic programming can be used to solve this problem.
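Equation (14.98) can be transcribed into Python directly. This sketch (the name lcs_rec is assumed, for illustration) is exponential without a table, which is exactly what the memoized versions below repair:

def lcs_rec(xs, ys):
    # direct transcription of equation (14.98)
    if not xs or not ys:
        return ""
    if xs[0] == ys[0]:
        return xs[0] + lcs_rec(xs[1:], ys[1:])
    a, b = lcs_rec(xs, ys[1:]), lcs_rec(xs[1:], ys)
    return a if len(a) >= len(b) else b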
A 2-dimensional table can be used to record the solutions to the sub-problems. The rows and columns represent the substrings of X and Y respectively.

        a   n   t   e   n   n   a
        1   2   3   4   5   6   7
b   1
a   2
n   3
a   4
n   5
a   6

This table shows an example of finding the longest common subsequence for the strings antenna and banana. Their lengths are 7 and 6. The bottom-right corner of this table is looked up first. Since it's empty, we need to compare the 7th element in antenna and the 6th in banana. They are both `a'; thus we next recursively look up the cell at row 5, column 6. It's still empty, and we repeat this till we either reach the trivial case that one substring becomes empty, or some cell we are looking up has been filled before. Similar to the change-making problem, whenever the optimal solution for a sub-problem is found, it is recorded in the cell for further reuse. Note that this process runs in the reversed order compared to the recursive equation given above: we start from the rightmost element of each string.

Considering that the longest common subsequence of any empty string is still empty, we can extend the solution table so that the first row and column hold the empty strings.

        ''  a   n   t   e   n   n   a
''
b
a
n
a
n
a

The below algorithm realizes the top-down recursive dynamic programming solution with such a table.

1: T ← NIL
2: function LCS(X, Y)
3:   m ← |X|, n ← |Y|
4:   m' ← m + 1, n' ← n + 1
5:   if T = NIL then
6:     T ← {{∅, ∅, ...}, {∅, NIL, NIL, ...}, ...}      ▷ m' × n'
7:   if X ≠ ∅ ∧ Y ≠ ∅ ∧ T[m'][n'] = NIL then
8:     if X[m] = Y[n] then
9:       T[m'][n'] ← Append(LCS(X[1...m−1], Y[1...n−1]), X[m])
10:    else
11:      T[m'][n'] ← Longer(LCS(X, Y[1...n−1]), LCS(X[1...m−1], Y))
12:   return T[m'][n']
The table is initialized so that the first row and column are filled with empty strings; the rest of the cells are all NIL values. Unless either string is empty or the cell content isn't NIL, the last two elements of the strings are compared, and the longest common subsequence is recursively computed on the substrings. The following Python example program implements this algorithm.

tab = None

def lcs(xs, ys):
    m = len(xs)
    n = len(ys)
    global tab
    if tab is None:
        tab = [[""] * (n+1)] + [[""] + [None] * n for _ in range(m)]
    if m != 0 and n != 0 and tab[m][n] is None:
        if xs[-1] == ys[-1]:
            tab[m][n] = lcs(xs[:-1], ys[:-1]) + xs[-1]
        else:
            (a, b) = (lcs(xs, ys[:-1]), lcs(xs[:-1], ys))
            tab[m][n] = a if len(b) < len(a) else b
    return tab[m][n]

The longest common subsequence can also be found in a bottom-up manner, as we did with the change-making problem. Besides that, instead of recording the whole sequences in the table, we can store just the lengths of the longest subsequences, and later construct the subsequence from this table and the two strings. This time, the table is initialized with all values set to 0.

1: function LCS(X, Y)
2:   m ← |X|, n ← |Y|
3:   T ← {{0, 0, ...}, {0, 0, ...}, ...}      ▷ (m + 1) × (n + 1)
4:   for i ← 1 to m do
5:     for j ← 1 to n do
6:       if X[i] = Y[j] then
7:         T[i + 1][j + 1] ← T[i][j] + 1
8:       else
9:         T[i + 1][j + 1] ← Max(T[i][j + 1], T[i + 1][j])
10:  return Get(T, X, Y, m, n)

11: function Get(T, X, Y, i, j)
12:   if i = 0 ∨ j = 0 then
13:     return ∅
14:   else if X[i] = Y[j] then
15:     return Append(Get(T, X, Y, i − 1, j − 1), X[i])
16:   else if T[i − 1][j] > T[i][j − 1] then
17:     return Get(T, X, Y, i − 1, j)
18:   else
19:     return Get(T, X, Y, i, j − 1)

In the bottom-up approach, we start from the cell at the second row and the second column. That cell corresponds to the first elements of both X and Y. If they are the same, the length of the longest common subsequence so far is 1. This is yielded by increasing the length of the empty sequence, which is stored in the top-left cell, by one. Otherwise, we pick the maximum value from the upper cell and the left cell. The table is repeatedly filled in this manner.

After that, a back-track is performed to construct the longest common subsequence. This time we start from the bottom-right corner of the table. If the last elements in X and Y are the same, we put this element as the last one of the result, and go on looking up the cell along the diagonal; otherwise, we compare the values in the left cell and the upper cell, and go on looking up the cell with the bigger value. The following example Python program implements this algorithm.

def lcs(xs, ys):
    m = len(xs)
    n = len(ys)
    c = [[0] * (n+1) for _ in range(m+1)]
    for i in range(1, m+1):
        for j in range(1, n+1):
            if xs[i-1] == ys[j-1]:
                c[i][j] = c[i-1][j-1] + 1
            else:
                c[i][j] = max(c[i-1][j], c[i][j-1])
    return get(c, xs, ys, m, n)

def get(c, xs, ys, i, j):
    if i == 0 or j == 0:
        return []
    elif xs[i-1] == ys[j-1]:
        return get(c, xs, ys, i-1, j-1) + [xs[i-1]]
    elif c[i-1][j] > c[i][j-1]:
        return get(c, xs, ys, i-1, j)
    else:
        return get(c, xs, ys, i, j-1)
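Joining the returned list recovers the subsequence for the two strings tabulated above:

''.join(lcs("antenna", "banana"))   # => 'anna'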
The bottom-up dynamic programming solution can also be defined in a purely functional way. A finger tree can be used as the table. Its first row is filled with n + 1 zero values. This table can be built by folding over sequence X; then the longest common subsequence is constructed from the table.

LCS(X, Y) = construct(fold(f, {{0, 0, ..., 0}}, zip({1, 2, ...}, X)))      (14.99)

Note that, since the table needs to be looked up by index, X is zipped with the natural numbers. Function f creates a new row of the table by folding over sequence Y, and records the lengths of the longest common subsequences for all cases so far.

f(T, (i, x)) = insert(T, fold(longest, {0}, zip({1, 2, ...}, Y)))      (14.100)

Function longest takes the intermediate, partially filled row and a pair of index and element in Y. It compares whether this element is the same as the one in X, then fills the new cell with the length of the longest one.

longest(R, (j, y)) = { insert(R, 1 + T[i − 1][j − 1])           : x = y
                     { insert(R, max(T[i − 1][j], T[i][j − 1])) : otherwise      (14.101)

After the table is built, the longest common subsequence can be constructed recursively by looking it up. We can pass the reversed sequences X̄ and Ȳ, together with their lengths m and n, for efficient building.

construct(T) = get((X̄, m), (Ȳ, n))      (14.102)

If the sequences are not empty, denote the first elements as x and y; the rest elements are held in X̄' and Ȳ' respectively. The function get can be defined as the following.

get((X̄, i), (Ȳ, j)) = { ∅                                    : X̄ = ∅ ∧ Ȳ = ∅
                       { get((X̄', i − 1), (Ȳ', j − 1)) ∪ {x} : x = y
                       { get((X̄', i − 1), (Ȳ, j))            : T[i − 1][j] > T[i][j − 1]
                       { get((X̄, i), (Ȳ', j − 1))            : otherwise      (14.103)

The below Haskell example program implements this solution.

lcs' xs ys = construct $ foldl f (singleton $ fromList $ replicate (n+1) 0)
                               (zip [1..] xs) where
    (m, n) = (length xs, length ys)
    f tab (i, x) = tab |> (foldl longer (singleton 0) (zip [1..] ys)) where
        longer r (j, y) = r |> if x == y
                               then 1 + (tab `index` (i-1) `index` (j-1))
                               else max (tab `index` (i-1) `index` j)
                                        (r `index` (j-1))
    construct tab = get (reverse xs, m) (reverse ys, n) where
        get ([], 0) ([], 0) = []
        get ((x:xs), i) ((y:ys), j)
            | x == y = get (xs, i-1) (ys, j-1) ++ [x]
            | (tab `index` (i-1) `index` j) > (tab `index` i `index` (j-1)) =
                  get (xs, i-1) ((y:ys), j)
            | otherwise = get ((x:xs), i) (ys, j-1)

Subset sum problem

Dynamic programming is not limited to optimization problems; it can also solve some more general searching problems. The subset sum problem is such an example. Given a set of integers, is there a non-empty subset that sums to zero? For example, there are two subsets of {11, 64, −82, −68, 86, 55, −88, −21, 51} that both sum to zero. One is {64, −82, 55, −88, 51}; the other is {64, −82, −68, 86}.

Of course, summing to zero is a special case; sometimes people want to find a subset whose sum is a given value s. Here we are going to develop a method to find all the candidate subsets.
There is an obvious brute-force, exhaustive search solution. For every element, we can either pick it or not, so there are in total 2^n options for a set with n elements. For every selection, we need to check whether it sums to s, which is a linear operation. The overall complexity is bound to O(n · 2^n). This is an exponential algorithm, which takes a very long time if the set is big.
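A Python sketch of this exhaustive method (the helper name brute_subsets is assumed); it recovers the two zero-sum subsets given above, and is practical only for small sets:

from itertools import combinations

def brute_subsets(xs, s):
    # try all 2^n - 1 non-empty selections; O(n * 2^n) overall
    return [list(c) for r in range(1, len(xs) + 1)
                    for c in combinations(xs, r)
                    if sum(c) == s]

# brute_subsets([11, 64, -82, -68, 86, 55, -88, -21, 51], 0)
# yields the two subsets mentioned above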
There is a recursive solution to the subset sum problem. If the set is empty, there is definitely no solution. Otherwise, let the set be X = {x1, x2, ...}. If x1 = s, then the subset {x1} is a solution, and we next need to search the subsets of X' = {x2, x3, ...} for those that sum to s. Otherwise, if x1 ≠ s, there are two different kinds of possibilities: we need to search X' both for sum s and for sum s − x1. For any subset summing to s − x1, we can add x1 to it to form a new set as a solution. The following equation defines this algorithm.

solve(X, s) = { ∅                                                   : X = ∅
              { {{x1}} ∪ solve(X', s)                               : x1 = s
              { solve(X', s) ∪ {{x1} ∪ S | S ∈ solve(X', s − x1)}   : otherwise      (14.104)

There are clear substructures in this definition, although they are not in the sense of optimality. And there are also overlapping sub-problems. This indicates that the problem can be solved with dynamic programming, with a table to memoize the solutions to sub-problems.
Instead of developing a solution that outputs all the subsets directly, let's first consider how to give the existence answer: output 'yes' if there exists some subset summing to s, and 'no' otherwise.

One fact is that the upper and lower limits for all possible sums can be calculated in one scan. If the given sum s doesn't belong to this range, there is obviously no solution.

s_l = Σ{x ∈ X, x < 0}
s_u = Σ{x ∈ X, x > 0}      (14.105)

Otherwise, if s_l ≤ s ≤ s_u, since the values are all integers, we can use a table with s_u − s_l + 1 columns. Each column represents a possible value in this range, from s_l to s_u. The value of a cell is either true or false, representing whether there exists a subset summing to this value. All cells are initialized as false. Starting with the first element x1 in X, definitely the set {x1} can sum to x1, so the cell representing this value in the first row can be filled as true.

      s_l   s_l + 1   ...   x1   ...   s_u
x1     F       F      ...    T   ...    F

With the next element, x2, there are three possible kinds of sums. Similar to the first row, {x2} sums to x2. All the possible sums of the previous row can also be achieved without x2, so the cell below the one for x1 should also be true. And by adding x2 to all the possible sums so far, we can get some new values; the cell representing x1 + x2 should be true.

      s_l   s_l + 1   ...   x1   ...   x2   ...   x1 + x2   ...   s_u
x1     F       F      ...    T   ...    F   ...      F      ...    F
x2     F       F      ...    T   ...    T   ...      T      ...    F

Generally speaking, when filling the i-th row, all the possible sums constructed with {x1, x2, ..., x_{i−1}} so far can also be achieved without x_i, so the cells that were previously true should also be true in this new row. The cell representing the value x_i should also be true, since the singleton set {x_i} sums to it. And we can also add x_i to all previously constructed sums to get new results; the cells representing these new sums should also be filled as true.

When all the elements are processed like this, a table with |X| rows is built. Looking up the cell representing s in the last row tells whether there exists a subset that sums to this value. As mentioned above, there is no solution if s < s_l or s_u < s. We skip handling this case for the sake of brevity.
1: function Subset-Sum(X, s)
2:   s_l ← Σ{x ∈ X, x < 0}
3:   s_u ← Σ{x ∈ X, x > 0}
4:   n ← |X|
5:   T ← {{False, False, ...}, {False, False, ...}, ...}      ▷ n × (s_u − s_l + 1)
6:   for i ← 1 to n do
7:     for j ← s_l to s_u do
8:       if X[i] = j then
9:         T[i][j] ← True
10:      if i > 1 then
11:        T[i][j] ← T[i][j] ∨ T[i − 1][j]
12:        j' ← j − X[i]
13:        if s_l ≤ j' ≤ s_u then
14:          T[i][j] ← T[i][j] ∨ T[i − 1][j']
15:  return T[n][s]

Note that the index into the columns of the table doesn't range from 1 to s_u − s_l + 1, but maps directly from s_l to s_u. Because most programming environments don't support negative indices, this can be dealt with by using T[i][j − s_l]. The following example Python program instead utilizes Python's negative indexing.

def solve(xs, s):
    low = sum([x for x in xs if x < 0])
    up = sum([x for x in xs if x > 0])
    tab = [[False] * (up - low + 1) for _ in xs]
    for i in range(0, len(xs)):
        for j in range(low, up + 1):
            tab[i][j] = (xs[i] == j)
            j1 = j - xs[i]
            tab[i][j] = tab[i][j] or tab[i-1][j] or \
                        (low <= j1 and j1 <= up and tab[i-1][j1])
    return tab[-1][s]

Note that this program doesn't use different branches for i = 0 and i = 1, 2, ..., n − 1. This is because when i = 0, the row index i − 1 = −1 refers to the last row of the table, which is all false. This simplifies the logic one step further.
With this table built, it's easy to construct all the subsets that sum to s. The method is to look up the cell representing s in the last row. If the last element x_n = s, then {x_n} definitely is a candidate. We next look up the previous row for s, and recursively construct all the possible subsets summing to s with {x1, x2, x3, ..., x_{n−1}}. Finally, we look up the second-to-last row for the cell representing s − x_n. For every subset summing to this value, we add the element x_n to construct a new subset, which sums to s.

1: function Get(X, s, T, n)
2:   S ← ∅
3:   if X[n] = s then
4:     S ← S ∪ {X[n]}
5:   if n > 1 then
6:     if T[n − 1][s] then
7:       S ← S ∪ Get(X, s, T, n − 1)
8:     if T[n − 1][s − X[n]] then
9:       S ← S ∪ {{X[n]} ∪ S' | S' ∈ Get(X, s − X[n], T, n − 1)}
10:  return S

The following Python example program translates this algorithm.

def get(xs, s, tab, n):
    r = []
    if xs[n] == s:
        r.append([xs[n]])
    if n > 0:
        if tab[n-1][s]:
            r = r + get(xs, s, tab, n-1)
        if tab[n-1][s - xs[n]]:
            r = r + [[xs[n]] + ys for ys in get(xs, s - xs[n], tab, n-1)]
    return r
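A small driver for these two functions. It assumes a hypothetical variant solve_table(xs), which builds and returns the same tab that solve() computes internally (the version above returns only the boolean):

xs = [11, 64, -82, -68, 86, 55, -88, -21, 51]
tab = solve_table(xs)                  # hypothetical helper; see note above
print(get(xs, 0, tab, len(xs) - 1))    # the two zero-sum subsets, in some order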
This dynamic programming solution to the subset sum problem loops O(n(s_u − s_l + 1)) times to build the table, and recursively uses O(n) time to construct the final solution from the table. The space it uses is also bound to O(n(s_u − s_l + 1)).

Instead of using a table with n rows, a vector can be used alternatively. For every cell representing a possible sum, the list of subsets is stored. This vector is initialized to contain all empty sets. For every element in X, we update the vector, so that it records all the possible sums which can be built so far. When all the elements have been considered, the cell corresponding to s contains the final result.

1: function Subset-Sum(X, s)
2:   s_l ← Σ{x ∈ X, x < 0}
3:   s_u ← Σ{x ∈ X, x > 0}
4:   T ← {∅, ∅, ...}      ▷ s_u − s_l + 1
5:   for x ∈ X do
6:     T' ← Duplicate(T)
7:     for j ← s_l to s_u do
8:       j' ← j − x
9:       if x = j then
10:        T'[j] ← T'[j] ∪ {{x}}
11:      if s_l ≤ j' ≤ s_u ∧ T[j'] ≠ ∅ then
12:        T'[j] ← T'[j] ∪ {{x} ∪ S | S ∈ T[j']}
13:    T ← T'
14:  return T[s]

The corresponding Python example program is given below.

def subsetsum(xs, s):
    low = sum([x for x in xs if x < 0])
    up = sum([x for x in xs if x > 0])
    tab = [[] for _ in range(low, up + 1)]
    for x in xs:
        # duplicate the vector; the inner lists must be copied too, so that
        # appending to tab1 doesn't also modify the previous round's tab
        tab1 = [ys[:] for ys in tab]
        for j in range(low, up + 1):
            if x == j:
                tab1[j].append([x])
            j1 = j - x
            if low <= j1 and j1 <= up and tab[j1] != []:
                tab1[j] = tab1[j] + [[x] + ys for ys in tab[j1]]
        tab = tab1
    return tab[s]
This imperative algorithm shows a clear structure: the solution table is built by looping over every element. This can be realized in a purely functional way by folding. A finger tree can be used to represent the vector spanning from s_l to s_u. It is initialized with all empty values, as in the following equation.

subsetsum(X, s) = fold(build, {∅, ∅, ...}, X)[s]      (14.106)

After folding, the solution table is built, and the answer is looked up at cell s. (Again, we skip the error handling for the case that s < s_l or s > s_u; there is no solution if s is out of range.) For every element x ∈ X, function build folds over the list {s_l, s_l + 1, ..., s_u}. With every value j, it checks whether j equals x and appends the singleton set {x} to the j-th cell. Note that here the cell is indexed from s_l, not from 0. If the cell corresponding to j − x is not empty, the candidate solutions stored in that place are also duplicated, and element x is added to each of them.

build(T, x) = fold(f, T, {s_l, s_l + 1, ..., s_u})      (14.107)

f(T, j) = { T'[j] ∪ {{x} ∪ Y | Y ∈ T[j']}  : s_l ≤ j' ≤ s_u ∧ T[j'] ≠ ∅, where j' = j − x
          { T'                             : otherwise      (14.108)

Here the adjustment is applied to T', which is itself an adjustment of T, as shown below.

T' = { {x} ∪ T[j]  : x = j
     { T           : otherwise      (14.109)

Note that the first clause in both equations (14.108) and (14.109) returns a new table with a certain cell updated with the given value. The following Haskell example program implements this algorithm.

subsetsum xs s = foldl build (fromList [[] | _ <- [l..u]]) xs `idx` s where
    l = sum $ filter (< 0) xs
    u = sum $ filter (> 0) xs
    idx t i = index t (i - l)
    build tab x = foldl (\t j -> let j' = j - x in
                    adjustIf (l <= j' && j' <= u && tab `idx` j' /= [])
                             (++ [(x:ys) | ys <- tab `idx` j']) j
                             (adjustIf (x == j) ([x]:) j t))
                  tab [l..u]
    adjustIf pred f i seq = if pred then adjust f (i - l) seq else seq

Some materials, like [16], provide common structures to abstract dynamic programming, so that problems can be solved with a generic solution by customizing the precondition, the comparison of candidate solutions for the better choice, and the merge method for sub-solutions. However, the variety of problems makes things complex in practice. It's important to study the properties of the problem carefully.

Exercise 14.3
- Realize a maze solver using the stack approach, which can find all the possible paths.

- There are 92 distinct solutions for the 8 queens puzzle. For any one solution, rotating it by 90°, 180°, or 270° gives solutions too. Flipping it vertically and horizontally also generates solutions. Some solutions are symmetric, so that rotation or flipping gives the same one. There are 12 unique solutions in this sense. Modify the program to find the 12 unique solutions. Improve the program so that the 92 distinct solutions can be found with less searching.

- Make the 8 queens puzzle solution generic so that it can solve the n queens puzzle.

- Make the functional solution to the leap frogs puzzle generic, so that it can solve the n frogs case.

- Modify the wolf, goat, and cabbage puzzle algorithm, so that it can find all possible solutions.

- Give the complete algorithm definition to solve the 2 water jugs puzzle with the extended Euclid algorithm.

- We don't in fact need the exact linear combination information x and y. After we know the puzzle is solvable by testing with the GCD, we can blindly execute the process: fill A, pour A into B; whenever B is full, empty it, till there is the expected volume in one jug. Realize this solution. Can it find a faster solution than the original version?

- Compared to the extended Euclid method, the DFS approach is a kind of brute-force searching. Improve the extended Euclid approach by finding the best linear combination, the one which minimizes |x| + |y|.

- Realize the imperative Huffman code table generating algorithm.

- One option for realizing the bottom-up solution to the longest common subsequence problem is to record directions in the table. Thus, instead of storing the length information, three values like 'N' for north, 'W' for west, and 'NW' for northwest are used to indicate how to construct the final result. We start from the bottom-right corner of the table; if the cell value is 'NW', we go along the diagonal by moving to the cell in the upper-left; if it's 'N', we move vertically to the upper row; and we move horizontally if it's 'W'. Implement this approach in your favorite programming language.

14.4 Short summary

This chapter introduces the elementary methods of searching. Some of them instruct the computer to scan for interesting information among the data. They often keep some structure that can be updated during the scan. This can be considered a special case of the information reusing approach.
The other commonly used strategy is divide and conquer, where the scale of the search domain keeps decreasing till the result becomes obvious. This chapter also explains methods of searching for solutions within a domain. The solutions typically are not the elements being searched; they can be a series of decisions or some arrangement of operations. If there are multiple solutions, people sometimes want to find the optimal one. For some special cases, there exist simplified approaches such as the greedy method. And dynamic programming can be used for a wider range of problems, as long as they show optimal substructure.
Bibliography

[1] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional; 2nd edition (May 4, 1998). ISBN-10: 0201896850, ISBN-13: 978-0201896855
[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms, Second Edition. ISBN: 0262032937. The MIT Press. 2001
[3] M. Blum, R.W. Floyd, V. Pratt, R. Rivest and R. Tarjan. Time bounds for selection. J. Comput. System Sci. 7 (1973) 448-461.
[4] Jon Bentley. Programming Pearls, Second Edition. Addison-Wesley Professional; 1999. ISBN-13: 978-0201657883
[5] Richard Bird. Pearls of Functional Algorithm Design. Chapter 3. Cambridge University Press. 2010. ISBN: 1139490605, 9781139490603
[6] Edsger W. Dijkstra. The saddleback search. EWD-934. 1985. http://www.cs.utexas.edu/users/EWD/index09xx.html
[7] Robert Boyer and Strother Moore. MJRTY - A Fast Majority Vote Algorithm. Automated Reasoning: Essays in Honor of Woody Bledsoe, Automated Reasoning Series, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1991, pp. 105-117.
[8] Graham Cormode and S. Muthukrishnan (2004). An Improved Data Stream Summary: The Count-Min Sketch and its Applications. J. Algorithms 55: 29-38.
[9] Donald Knuth, James H. Morris, Jr. and Vaughan Pratt. Fast pattern matching in strings. SIAM Journal on Computing 6 (2): 323-350. 1977.
[10] Robert Boyer and Strother Moore. A Fast String Searching Algorithm. Comm. ACM (New York, NY, USA: Association for Computing Machinery) 20 (10): 762-772. 1977
[11] R. N. Horspool. Practical fast searching in strings. Software - Practice & Experience 10 (6): 501-506. 1980.
[12] Wikipedia. Boyer-Moore string search algorithm. http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
[13] Wikipedia. Eight queens puzzle. http://en.wikipedia.org/wiki/Eight_queens_puzzle
[14] George Polya. How to Solve It: A New Aspect of Mathematical Method. Princeton University Press (April 25, 2004). ISBN-13: 978-0691119663
[15] Wikipedia. David A. Huffman. http://en.wikipedia.org/wiki/David_A._Huffman
[16] Fethi Rabhi and Guy Lapalme. Algorithms: A Functional Programming Approach. Second edition. Addison-Wesley.
Appendix A

Lists

A.1 Introduction

This book intensively uses recursive list manipulation in purely functional settings. The list can be treated as the counterpart to the array in imperative settings; these are the bricks of many algorithms and data structures. For readers who are not familiar with functional list manipulation, this appendix provides a quick reference. All operations listed in this appendix are not only described in equations, but are also implemented in both functional and imperative programming languages as examples. We also provide a special type of implementation in C++ template meta-programming, similar to [3], which will be of interest in the next appendix.

Besides the elementary list operations, this appendix also contains explanations of some higher-order function concepts such as mapping, folding etc.

A.2 List Definition

Like arrays in imperative settings, lists play a critical role in the functional setting.1 Lists are built-in in some programming languages, like the Lisp families and ML families, so one needn't explicitly define lists in those environments. A list, or more precisely a singly linked list, is a data structure that can be described as below.

- A list is either empty;
- Or it contains an element and a list.

Note that this definition is recursive. Figure A.1 illustrates a list with N nodes. Each node contains two parts: a key element and a sub-list. The sub-list contained in the last node is empty, which is denoted as 'NIL'.

1 Some readers may argue that `lambda calculus plays the most critical role'. Lambda calculus is somewhat like assembly language for the computation world, and is worth studying from the essence of the computation model to practical programs. However, we don't dive into that topic in this book. Readers can refer to [4] for details.
Figure A.1: A list contains N nodes: (key[1], next) → (key[2], next) → ... → (key[N], NIL)

This data structure can be explicitly defined in programming languages that support the record (or compound type) concept. The following ISO C++ code defines the list.2

template<typename T>
struct List {
    T key;
    List* next;
};

2 We only use templates to parameterize the type of the element in this chapter. Except for this point, all imperative source code is in ANSI C style, to avoid language-specific features.

A.2.1 Empty list

It is worth discussing the 'empty' list in a bit more detail. In environments supporting the nil concept, for example C or Java-like programming languages, an empty list can have two different representations. One is the trivial `NIL' (or null, or 0, which varies between languages); the other is a non-NIL empty list {}, the latter typically allocated with memory but filled with nothing. In Lisp dialects, the empty list is commonly written as '(). In ML families, it's written as []. We use ∅ to denote the empty list in equations, and sometimes use 'NIL' in pseudo code to describe algorithms in this book.

A.2.2 Access the element and the sub-list

Given a list L, two functions can be defined to access the element stored in it and the sub-list, respectively. They are typically denoted as first(L) and rest(L), or head(L) and tail(L), with the same meaning. These two functions are named car and cdr in Lisp, for historical reasons related to the design of machine registers [5].
In languages supporting pattern matching (e.g. ML families, Prolog and Erlang etc.), these two functions are commonly realized by matching the cons, which we'll introduce later. For example, the following Haskell program:

head (x:xs) = x
tail (x:xs) = xs

If the list is defined in record syntax like we did above, these two functions can be realized by accessing the record fields.3

template<typename T>
T first(List<T>* xs) { return xs->key; }

template<typename T>
List<T>* rest(List<T>* xs) { return xs->next; }

3 They can also be named 'key' and 'next', or be defined as class methods.

In this book, L' is sometimes used to denote rest(L), and we also use l1 to represent first(L) in contexts where the list is literally given in the form L = {l1, l2, ..., lN}.

More interestingly, as long as the environment supports recursion, we can define the List. The following example defines a list of integers at C++ compile time.

struct Empty;

template<int x, typename T> struct List {
    static const int first = x;
    typedef T rest;
};

This line constructs the list {1, 2, 3, 4, 5} at compile time.

typedef List<1, List<2, List<3, List<4, List<5, Empty> > > > > A;

A.3 Basic list manipulation

A.3.1 Construction

The last C++ template meta-programming example actually shows the literal construction of a list. A list can be constructed from an element and a sub-list, where the sub-list can be empty. We denote the function cons(x, L) as the constructor. This name is used in most Lisp dialects. In ML families, the `cons' operator is defined as :: (in Haskell it's :). We can define cons to create a record, as we defined above, in ISO C++ for example.4

template<typename T>
List<T>* cons(T x, List<T>* xs) {
    List<T>* lst = new List<T>;
    lst->key = x;
    lst->next = xs;
    return lst;
}

4 It is often defined as a constructor method of the class template; however, we define it as a standalone function for illustration purposes.
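The same record can be sketched in Python with a tiny node class (a hypothetical Cons helper mirroring the C++ definition above; None plays the role of the empty list):

class Cons:
    def __init__(self, key, next=None):
        self.key = key        # the element stored in this node
        self.next = next      # the sub-list; None stands for the empty list

def cons(x, xs): return Cons(x, xs)   # O(1) construction
def first(xs):   return xs.key
def rest(xs):    return xs.next

# build the list {1, 2, 3} by cons-ing onto the empty list
lst = cons(1, cons(2, cons(3, None)))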
A.3.2 Empty testing and length calculating

It is trivial to test whether a list is empty. If the environment contains the nil concept, the test should also handle the nil case. Both Lisp dialects and ML families provide null-testing functions. Empty testing can also be realized by pattern matching with the empty list, where possible. The following Haskell program shows such an example.

null [] = True
null _ = False

In this book we will use either empty(L) or L = ∅ where empty testing happens.

With empty testing defined, it's possible to calculate the length of a list. In imperative settings, Length is often implemented like the following.

function Length(L)
  n ← 0
  while L ≠ NIL do
    n ← n + 1
    L ← Next(L)
  return n

This ISO C++ code translates the algorithm to a real program.

template<typename T>
int length(List<T>* xs) {
    int n = 0;
    for (; xs; ++n, xs = xs->next);
    return n;
}

However, in a purely functional setting, we can't mutate a counter variable. The idea is that if the list is empty, its length is zero; otherwise, we recursively calculate the length of the sub-list, then add one to get the length of this list.

length(L) = { 0               : L = ∅
            { 1 + length(L')  : otherwise      (A.1)

Here L' = rest(L) as mentioned above; it's {l2, l3, ..., lN} for a list containing N elements. Note that both L and L' can be the empty list ∅. In this equation, we also use '= ∅' to test whether list L is empty. In order to know the length of a list, we need to traverse all the elements from the head to the end, so this algorithm is proportional to the number of elements stored in the list. It is a linear algorithm bound to O(N) time.

Below are two programs, in Haskell and in Scheme/Lisp, realizing this recursive algorithm.

length [] = 0
length (x:xs) = 1 + length xs

(define (length lst)
  (if (null? lst)
      0
      (+ 1 (length (cdr lst)))))

How to test whether two lists are identical is left as an exercise to the reader.
A.3.3 Indexing

One big difference between the array and the list (the singly linked list, to be accurate) is that the array supports random access. Many programming languages support using x[i] to access the i-th element stored in an array in constant O(1) time. The index typically starts from 0, but that's not always the case; some programming languages use 1 as the first index. In this appendix, we treat indices as starting from 0. For a list, however, we must traverse it i steps to reach the target element. The traversal is quite similar to the length calculation. It's commonly expressed as below in imperative settings.

function Get-At(L, i)
  while i ≠ 0 do
    L ← Next(L)
    i ← i − 1
  return First(L)

Note that this algorithm doesn't handle the error case in which the index isn't within the bounds of the list. We assume that 0 ≤ i < |L|, where |L| = length(L). The error handling is left as an exercise to the reader. The following ISO C++ code is a line-by-line translation of this algorithm.

template<typename T>
T getAt(List<T>* lst, int n) {
    while (n--)
        lst = lst->next;
    return lst->key;
}

However, in a purely functional setting, we turn to recursive traversal instead of a while-loop.

getAt(L, i) = { First(L)              : i = 0
              { getAt(Rest(L), i − 1) : otherwise      (A.2)

In order to get the i-th element, the algorithm does the following: if i is 0, then we are done; the result is the first element in the list. Otherwise, the result is the (i − 1)-th element of the sub-list. This algorithm can be translated to the following Haskell code.

getAt i (x:xs) = if i == 0 then x else getAt (i-1) xs

Note that we are using pattern matching to ensure the list isn't empty, which actually handles all out-of-bound cases with an un-matched pattern error. Thus if i > |L|, we finally arrive at the edge case where the index is i − |L| while the list is empty; on the other hand, if i < 0, subtracting one makes it even farther away from 0, and we finally end at the same error: the index is some negative value while the list is empty.

The indexing algorithm takes time proportional to the value of the index, which is bound to O(N) linear time. This section only addresses the read semantics. How to mutate the element at a given position is explained in a later section.
A.3.4 Access the last element

Although accessing the first element and the rest list L' is trivial, the opposite operations, retrieving the last element and the initial sub-list, need linear time when a tail pointer is not used. If the list isn't empty, we need to traverse it to the tail to get these two components. Below are their imperative descriptions.

function Last(L)
  x ← NIL
  while L ≠ NIL do
    x ← First(L)
    L ← Rest(L)
  return x

function Init(L)
  L' ← NIL
  while Rest(L) ≠ NIL do
    L' ← Append(L', First(L))
    L ← Rest(L)
  return L'

The algorithms assume that the input list isn't empty, so the error handling is skipped. Note that the Init() algorithm uses the appending algorithm, which will be defined later. Below are the corresponding ISO C++ implementations. The optimized version utilizing a tail pointer is left as an exercise.

template<typename T>
T last(List<T>* xs) {
    T x; /* Can be set to a special value to indicate the empty-list error. */
    for (; xs; xs = xs->next)
        x = xs->key;
    return x;
}

template<typename T>
List<T>* init(List<T>* xs) {
    List<T>* ys = NULL;
    for (; xs->next; xs = xs->next)
        ys = append(ys, xs->key);
    return ys;
}

These two algorithms can be implemented in a purely recursive manner as well. When we want to access the last element: if the list contains only one element (the rest sub-list is empty), the result is this very element; otherwise, the result is the last element of the rest sub-list.

last(L) = { First(L)      : Rest(L) = ∅
          { last(Rest(L)) : otherwise      (A.3)

A similar approach can be used to get a list containing all elements except for the last one. The edge case: if the list contains only one element, then the result is an empty list; otherwise, we first get the list containing all elements except for the last one from the rest sub-list, then construct the final result from the first element and this intermediate result.

init(L) = { ∅                   : L' = ∅
          { cons(l1, init(L'))  : otherwise      (A.4)

Here we denote l1 as the first element of L, and L' is the rest sub-list. This recursive algorithm needn't use appending; it actually constructs the final result list from right to left. We'll introduce a high-level concept for this kind of computation later in this appendix.

Below are Haskell programs implementing last() and init() using pattern matching.

last [x] = x
last (_:xs) = last xs

init [x] = []
init (x:xs) = x : init xs

Where [x] matches the singleton list containing only one element, while (_:xs) matches any non-empty list, and the underscore (_) indicates that we don't care about that element. For the details of pattern matching, readers can refer to any Haskell tutorial material, such as [8].
  • 2739. nding the i-th element in a singly-linked list with the minimized memory spaces is interesting, and this problem is often used in technical interview in some companies. A naive implementation takes 2 rounds of traversing, the
  • 2740. rst round is to determine the length of the list N, then, calculate the left-hand index by N i1. Finally a second round of traverse is used to access the element with the left-hand index. This idea can be give as the following equation. getAtR(L; i) = getAt(L; length(L) i 1) There exists better imperative solution. For illustration purpose, we omit the error cases such as index is out-of-bound etc. The idea is to keep two pointers p1; p2, with the distance of i between them, that resti(p2) = p1, where resti(p1) means repleatedly apply rest() function i times. It says that succeeds i steps from p2 gets p1. We can start p2 from the head of the list and advance the two pointers in parallel till one of them (p1) arrives at the end of the list. At that time point, pointer p2 exactly arrived at the i-th element from right. Figure A.2 illustrates this idea. It is straightforward to realize the imperative algorithm based on this `double pointers' solution.
(Figure A.2: The `double pointers' solution to reverse indexing. (a) p2 starts from the head, i steps behind p1. (b) When p1 reaches the end, p2 points to the i-th element from the right.)

function Get-At-R(L, i)
  p ← L
  while i ≠ 0 do
    L ← Rest(L)
    i ← i - 1
  while Rest(L) ≠ NIL do
    L ← Rest(L)
    p ← Rest(p)
  return First(p)

The following ISO C++ code implements the `double pointers' right indexing algorithm.

template<typename T>
T getAtR(List<T>* xs, int i) {
    List<T>* p = xs;
    while (i--)
        xs = xs->next;
    for (; xs->next; xs = xs->next, p = p->next);
    return p->key;
}

The same idea can be realized recursively as well. If we want to access the i-th element from the right of list L, we can examine the two lists L and S = {l_i, l_{i+1}, ..., l_N} simultaneously, where S is the sub-list of L without the first i elements.

The edge case: if S is a singleton list, then the i-th element from the right is the first element in L; otherwise, we drop the first element from both L and S, and recursively examine L′ and S′. This description can be formalized as the following equations.

$$getAtR(L, i) = examine(L, drop(i, L)) \tag{A.5}$$

Where the function examine(L, S) is defined as below.

$$examine(L, S) = \begin{cases} first(L) & : |S| = 1 \\ examine(rest(L), rest(S)) & : \text{otherwise} \end{cases} \tag{A.6}$$

We'll explain the details of the drop() function in the later section about list mutating operations. Here it can be implemented by repeatedly calling rest() the specified number of times.

$$drop(n, L) = \begin{cases} L & : n = 0 \\ drop(n-1, rest(L)) & : \text{otherwise} \end{cases}$$

Translating the equations to Haskell yields this example program.

atR :: [a] -> Int -> a
atR xs i = get xs (drop i xs)
  where
    get (x:_) [_] = x
    get (_:xs) (_:ys) = get xs ys
    drop n as@(_:as') = if n == 0 then as else drop (n-1) as'

Here the dummy variable _ serves as a placeholder for components we don't care about.

A.3.6 Mutating

Strictly speaking, we can't mutate a list at all in purely functional settings. Unlike in imperative settings, mutation is actually realized by creating a new list. Almost all functional environments support garbage collection, so the original list may either be persisted for reuse, or released (dropped) at some time (Chapter 2 in [6]).

Appending

The function cons can be viewed as building a list by always inserting elements at the head. Chaining multiple cons operations repeatedly constructs a list from right to left. Appending, on the other hand, is an operation that adds an element to the tail. Compared to cons, which is a trivial constant time O(1) operation, we must traverse the whole list to locate the appending position. This means appending is bound to O(N), where N is the length of the list.

In order to speed up appending, imperative implementations typically use a field (variable) to record the tail position of the list, so that the traversing can be avoided. However, in purely functional settings we can't use such a `tail' pointer. Appending has to be realized in a recursive manner.

$$append(L, x) = \begin{cases} \{x\} & : L = \emptyset \\ cons(first(L), append(rest(L), x)) & : \text{otherwise} \end{cases} \tag{A.7}$$

The algorithm handles two different appending cases:
If the list is empty, the result is a singleton list containing x, the element to be appended. The singleton notation {x} = cons(x, ∅) is a simplified form of cons-ing the element with the empty list ∅;

Otherwise, for the non-empty list, the result can be achieved by first appending the element x to the rest sub-list, then constructing the first element of L with this recursive appending result.

For the non-trivial case, if we denote L = {l_1, l_2, ...} and L′ = {l_2, l_3, ...}, the equation can be written as follows.

$$append(L, x) = \begin{cases} \{x\} & : L = \emptyset \\ cons(l_1, append(L', x)) & : \text{otherwise} \end{cases} \tag{A.8}$$

We'll use both forms in the rest of this appendix. The following Scheme/Lisp program implements this algorithm.

(define (append lst x)
  (if (null? lst)
      (list x)
      (cons (car lst) (append (cdr lst) x))))

Even without the tail pointer, it's possible to traverse the list imperatively and append the element at the end.

function Append(L, x)
  if L = NIL then
    return Cons(x, NIL)
  H ← L
  while Rest(L) ≠ NIL do
    L ← Rest(L)
  Rest(L) ← Cons(x, NIL)
  return H

The following ISO C++ program implements this algorithm. How to utilize a tail field to speed up the appending is left as an exercise to the reader.

template<typename T>
List<T>* append(List<T>* xs, T x) {
    List<T> *tail, *head;
    for (head = tail = xs; xs; xs = xs->next)
        tail = xs;
    if (!head)
        head = cons<T>(x, NULL);
    else
        tail->next = cons<T>(x, NULL);
    return head;
}
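As a hint for that exercise, here is a minimal sketch of one possible approach. It assumes a hypothetical wrapper, called ListBuf here, which is not part of the book's code: it merely pairs the head pointer with a cached tail pointer so that appending becomes O(1).

// Hypothetical wrapper (not from the book): keeps head and tail together.
template<typename T>
struct ListBuf {
    List<T>* head;
    List<T>* tail;
    ListBuf() : head(NULL), tail(NULL) {}
};

// O(1) appending: no traversal; we link directly after the cached tail.
template<typename T>
void append(ListBuf<T>& buf, T x) {
    List<T>* cell = cons<T>(x, (List<T>*)NULL);
    if (!buf.head)
        buf.head = buf.tail = cell;   // first element: head and tail coincide
    else {
        buf.tail->next = cell;        // link after the old tail
        buf.tail = cell;              // advance the cached tail
    }
}

The price is that every operation which changes the end of the list (deleting the last element, concatenation, etc.) must keep the tail field up to date, which is exactly the point of the related exercise at the end of this section.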
Mutate element at a given position

Although we have defined the random access algorithm getAt(L, i), we can't just mutate the element returned by this function in purely functional settings. It is quite common to provide reference semantics in imperative programming languages and in some `almost' functional environments; readers can refer to [4] for details. For example, the following ISO C++ function returns a reference instead of a value in the indexing program.

template<typename T>
T& getAt(List<T>* xs, int n) {
    while (n--)
        xs = xs->next;
    return xs->key;
}

So that we can use this function to mutate the 2nd element as below.

List<int>* xs = cons(1, cons(2, cons<int>(3, NULL)));
getAt(xs, 1) = 4;

In an impure functional environment, such as Scheme/Lisp, setting the i-th element to a given value can also be implemented by mutating the referenced cell directly.

(define (set-at! lst i x)
  (if (= i 0)
      (set-car! lst x)
      (set-at! (cdr lst) (- i 1) x)))

This program first checks whether the index i is zero; if so, it mutates the first element of the list to the given value x; otherwise, it decreases the index i by one, and tries to mutate the rest of the list at this new index with value x. This function doesn't return a meaningful value; it is used for its side effect. For instance, the following code mutates the 2nd element in a list.

(define lst '(1 2 3 4 5))
(set-at! lst 1 4)
(display lst)
(1 4 3 4 5)

In order to realize a purely functional setAt(L, i, x) algorithm, we must avoid mutating the cell directly, and create a new one instead:

Edge case: if we want to set the value of the first element (i = 0), we construct a new list with the new value and the sub-list of the previous one;

Otherwise, we construct a new list with the previous first element, and a new sub-list which has its (i-1)-th element set to the new value.

This recursive description can be formalized by the following equation.

$$setAt(L, i, x) = \begin{cases} cons(x, L') & : i = 0 \\ cons(l_1, setAt(L', i-1, x)) & : \text{otherwise} \end{cases} \tag{A.9}$$

Comparing the Scheme/Lisp implementation below to the previous one reveals the difference from imperative mutating.
(define (set-at lst i x)
  (if (= i 0)
      (cons x (cdr lst))
      (cons (car lst) (set-at (cdr lst) (- i 1) x))))

Here we skip the error handling for out-of-bound errors etc. Again, similar to the random access algorithm, the performance is bound to linear time, as traversing is needed to locate the position to set the value.

Insertion

There are two semantics of list insertion. One is to insert an element at a given position, which can be denoted as insert(L, i, x); the algorithm is close to setAt(L, i, x). The other is to insert an element into a sorted list, so that the resulting list is still sorted.

Let's first consider how to insert an element x at a given position i. Obviously, we need to first traverse i elements to get to the position; the rest of the work is to construct a new sub-list with x as its head. Finally, we construct the whole result by attaching this new sub-list to the end of the first i elements. The algorithm can be described accordingly. If we want to insert an element x into a list L at position i:

Edge case: if i is zero, the insertion is a trivial `cons' operation, cons(x, L);

Otherwise, we recursively insert x into the sub-list L′ at position i-1, then construct the result from the first element and this insertion result.

The equation below formalizes the insertion algorithm.

$$insert(L, i, x) = \begin{cases} cons(x, L) & : i = 0 \\ cons(l_1, insert(L', i-1, x)) & : \text{otherwise} \end{cases} \tag{A.10}$$

The following Haskell program implements this algorithm.

insert xs 0 y = y:xs
insert (x:xs) i y = x : insert xs (i-1) y

This algorithm doesn't handle the out-of-bound error. However, we can interpret the case where the position i exceeds the length of the list as appending. Readers can consider it in the exercise of this section.

The algorithm can also be designed imperatively: if the position is zero, just construct the new list with the element to be inserted as the first one; otherwise, we record the head of the list, then start traversing the list i steps. We also need an extra variable to memorize the previous position for the later linking operation. Below is the pseudo code.

function Insert(L, i, x)
  if i = 0 then
    return Cons(x, L)
  H ← L
  p ← L
  while i ≠ 0 do
    p ← L
    L ← Rest(L)
    i ← i - 1
  Rest(p) ← Cons(x, L)
  return H

And the ISO C++ example program is given by translating this algorithm.

template<typename T>
List<T>* insert(List<T>* xs, int i, T x) {
    List<T> *head, *prev;
    if (i == 0)
        return cons(x, xs);
    for (head = xs; i; --i, xs = xs->next)
        prev = xs;
    prev->next = cons(x, xs);
    return head;
}

If the list L is sorted, that is, for any positions 1 ≤ i ≤ j ≤ N we have l_i ≤ l_j, we can design an algorithm which inserts a new element x into the list, so that the resulting list is still sorted.

$$insert(x, L) = \begin{cases} cons(x, \emptyset) & : L = \emptyset \\ cons(x, L) & : x \leq l_1 \\ cons(l_1, insert(x, L')) & : \text{otherwise} \end{cases} \tag{A.11}$$

The idea is that, to insert an element x into a sorted list L: if either L is empty or x is less than or equal to the first element of L, we just put x in front of L to construct the result; otherwise, we recursively insert x into the sub-list L′.

The following Haskell program implements this algorithm. Note that we use ≤ to determine the ordering. Actually this constraint can be loosened to strict less-than (<): as long as elements can be compared in terms of <, we can design a program that inserts an element so that the resulting list is still sorted. Readers can refer to the chapters about sorting in this book for details about ordering.

insert y [] = [y]
insert y xs@(x:xs') = if y <= x then y : xs else x : insert y xs'

Since the algorithm needs to compare the elements one by one, it's also a linear time algorithm. Note that here we use the `as' notation for pattern matching in Haskell. Readers can refer to [8] and [7] for details.

This ordered insertion algorithm can also be designed in an imperative manner, for example as the following pseudo code. (Readers can refer to the chapter `The evolution of insertion sort' in this book for a slightly different one.)

function Insert(x, L)
  if L = ∅ ∨ x ≤ First(L) then
    return Cons(x, L)
  H ← L
  while Rest(L) ≠ ∅ ∧ First(Rest(L)) < x do
    L ← Rest(L)
  Rest(L) ← Cons(x, Rest(L))
  return H

If either the list is empty, or the new element to be inserted is less than or equal to the first element in the list, we can just put this element as the new first one. Otherwise, we record the head, then traverse the list until a position where x is less than the rest of the sub-list, and put x in that position. Compared to the `insert at' algorithm shown previously, the variable p used to point to the previous position during traversing is omitted by examining the sub-list instead of the current list. The following ISO C++ program implements this algorithm.

template<typename T>
List<T>* insert(T x, List<T>* xs) {
    List<T>* head;
    if (!xs || x <= xs->key)
        return cons(x, xs);
    for (head = xs; xs->next && xs->next->key < x; xs = xs->next);
    xs->next = cons(x, xs->next);
    return head;
}

With this linear time ordered insertion defined, it's possible to implement quadratic time insertion sort by repeatedly inserting elements into an empty list, as formalized in this equation.

$$sort(L) = \begin{cases} \emptyset & : L = \emptyset \\ insert(l_1, sort(L')) & : \text{otherwise} \end{cases} \tag{A.12}$$

This equation says that if the list to be sorted is empty, the result is also empty; otherwise, we first recursively sort all elements except for the first one, then ordered-insert the first element into this intermediate result. The corresponding Haskell program is given below.

isort [] = []
isort (x:xs) = insert x (isort xs)

The imperative linked-list based insertion sort is described as follows: we initialize the result list as empty, then take the elements one by one from the list to be sorted, and ordered-insert them into the result list.

function Sort(L)
  L' ← ∅
  while L ≠ ∅ do
    L' ← Insert(First(L), L')
    L ← Rest(L)
  return L'

Note that at any time during the loop, the result list is kept sorted. There is a major difference between the recursive algorithm (formalized by the equation) and the procedural one (described by the pseudo code): the former processes the list from the right, while the latter processes it from the left. We'll see in the later section about `tail recursion' how to eliminate this difference. The ISO C++ version of linked-list insertion sort looks like this.

template<typename T>
List<T>* isort(List<T>* xs) {
    List<T>* ys = NULL;
    for (; xs; xs = xs->next)
        ys = insert(xs->key, ys);
    return ys;
}

There is also a dedicated chapter discussing insertion sort in this book. Please refer to that chapter for more details, including performance analysis and
fine-tuning.

Deletion

In purely functional settings, there is no deletion at all in terms of mutating; the data is persistent, and what deletion semantically means is actually to create a `new' list with all the elements of the previous one except for the element being `deleted'.

Similar to insertion, there are also two deletion semantics. One is to delete the element at a given position; the other is to find and delete elements of a given value. The first can be expressed as delete(L, i), while the second is delete(L, x).

In order to design the algorithm delete(L, i) (or `delete at'), we can use an idea quite similar to random access and insertion: first traverse the list to the specified position, then construct the result list from the elements we have traversed, together with all the others except for the next one, which we haven't traversed yet.

The strategy can be realized in a recursive manner. In order to delete the i-th element from list L:

If i is zero, that is, we are going to delete the first element of the list, the result is obviously the rest of the list;

If the list to be processed is empty, the result is empty anyway;

Otherwise, we can recursively delete the (i-1)-th element from the sub-list L′, then construct the result from the first element of L and this intermediate result.

Note there are two edge cases, and the second one is mainly used for error handling. This algorithm can be formalized with the following equation, where L′ = rest(L) and l_1 = first(L).

$$delete(L, i) = \begin{cases} L' & : i = 0 \\ \emptyset & : L = \emptyset \\ cons(l_1, delete(L', i-1)) & : \text{otherwise} \end{cases} \tag{A.13}$$

The corresponding Haskell example program is given below.

del (_:xs) 0 = xs
del [] _ = []
del (x:xs) i = x : del xs (i-1)

This is a linear time algorithm as well, and there are alternatives for implementation. For example, we can first split the list at position i - 1 to get two sub-lists L_1 and L_2, then concatenate L_1 and L_2′.
The `delete at' algorithm can also be realized imperatively, by traversing to the position with a loop:

function Delete(L, i)
  if i = 0 then
    return Rest(L)
  H ← L
  p ← L
  while i ≠ 0 do
    i ← i - 1
    p ← L
    L ← Rest(L)
  Rest(p) ← Rest(L)
  return H

Different from the recursive approach, the error handling for out-of-bound access is skipped. Besides that, the algorithm also skips the handling of resource releasing, which is necessary in environments without GC (garbage collection). The ISO C++ code below, for example, explicitly releases the node to be deleted.

template<typename T>
List<T>* del(List<T>* xs, int i) {
    List<T> *head, *prev;
    if (i == 0)
        head = xs->next;
    else {
        for (head = xs; i; --i, xs = xs->next)
            prev = xs;
        prev->next = xs->next;
    }
    xs->next = NULL;
    delete xs;
    return head;
}

Note that the statement xs->next = NULL is necessary if the destructor is designed to release the whole linked list recursively.

The `find and delete' semantics can further be represented in two ways: one is to find just the first occurrence of a given value and delete this element from the list; the other is to find ALL occurrences of this value and delete these elements. The latter is the more general case, and it can be achieved by a minor modification of the former. We leave the `find all and delete' algorithm as an exercise to the reader.

The algorithm should be designed exactly as the term `find and delete' says, not `find then delete': the finding and deleting are processed in one pass of traversing.

If the list to be dealt with is empty, the result is obviously empty;

If the list isn't empty, we examine the first element of the list; if it is identical to the given value, the result is the sub-list;

Otherwise, we keep the first element, and recursively find and delete the element with the given value in the sub-list. The final result is a list constructed from the kept first element and this recursive deleting result.
This algorithm can be formalized by the following equation.

$$delete(L, x) = \begin{cases} \emptyset & : L = \emptyset \\ L' & : l_1 = x \\ cons(l_1, delete(L', x)) & : \text{otherwise} \end{cases} \tag{A.14}$$

This algorithm is bound to linear time, as it traverses the list to find and delete the element. Translating the equation to Haskell yields the code below. Note that the first edge case is handled by pattern-matching the empty list, while the other two cases are further processed by an if-else expression.

del [] _ = []
del (x:xs) y = if x == y then xs else x : del xs y

Different from the above imperative algorithms, which skip the error handling in most cases, the imperative `find and delete' realization must deal with the situation where the given value doesn't exist.

function Delete(L, x)
  if L = ∅ then    ▷ Empty list
    return ∅
  if First(L) = x then
    H ← Rest(L)
  else
    H ← L
    while L ≠ ∅ ∧ First(L) ≠ x do    ▷ List isn't empty
      p ← L
      L ← Rest(L)
    if L ≠ ∅ then    ▷ Found
      Rest(p) ← Rest(L)
  return H

If the list is empty, the result is empty anyway; otherwise, the algorithm traverses the list until it either finds an element identical to the given value or reaches the end of the list. If the element is found, it is removed from the list. The following ISO C++ program implements the algorithm. Note that the code releases the memory explicitly.

template<typename T>
List<T>* del(List<T>* xs, T x) {
    List<T> *head, *prev;
    if (!xs)
        return xs;
    if (xs->key == x)
        head = xs->next;
    else {
        for (head = xs; xs && xs->key != x; xs = xs->next)
            prev = xs;
        if (xs)
            prev->next = xs->next;
    }
    if (xs) {
        xs->next = NULL;
        delete xs;
    }
    return head;
}

Concatenation

Concatenation can be considered as a general case of appending: appending adds only one extra element to the end of the list, while concatenation adds multiple elements. However, implementing concatenation naively by appending leads to a quadratic algorithm, which performs poorly. Consider the following equation.

$$concat(L_1, L_2) = \begin{cases} L_1 & : L_2 = \emptyset \\ concat(append(L_1, first(L_2)), rest(L_2)) & : \text{otherwise} \end{cases}$$

Note that each appending operation needs to traverse to the end of the list, which is proportional to the length of L_1, and we need to do this linear time appending work |L_2| times, so the total performance is O(|L_1| + (|L_1|+1) + ... + (|L_1|+|L_2|)) = O(|L_1||L_2| + |L_2|^2).

The key point is that the linking operation of a linked list is fast (constant O(1) time): we can traverse to the end of L_1 only once, and link the second list to the tail of L_1.

$$concat(L_1, L_2) = \begin{cases} L_2 & : L_1 = \emptyset \\ cons(first(L_1), concat(rest(L_1), L_2)) & : \text{otherwise} \end{cases} \tag{A.15}$$

This algorithm traverses the
first list only once to get the tail of L_1, then links the second list to this tail, so the algorithm is bound to linear O(|L_1|) time. It can be described as follows:

If the first list is empty, the concatenation result is the second list;

Otherwise, we concatenate the second list to the sub-list of the first one, and construct the result from the first element and this intermediate result.

Most functional languages provide built-in functions or operators for list concatenation; for example, in the ML families ++ is used for this purpose.

[] ++ ys = ys
xs ++ [] = xs
(x:xs) ++ ys = x : xs ++ ys

Note the added edge case: if the second list is empty, we needn't traverse to the end of the first one and perform the linking; the result is merely the first list.

In imperative settings, concatenation can be realized in constant O(1) time with the augmented tail record. We skip the detailed implementation of this method; readers can refer to the source code that can be downloaded along with this appendix.
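As a hedged illustration of that skipped method, the hypothetical ListBuf wrapper from the earlier appending sketch would make concatenation O(1). This is a sketch under that assumption, not the book's implementation.

// Sketch only: O(1) concatenation via the hypothetical ListBuf wrapper
// introduced earlier; both arguments are consumed by the operation.
template<typename T>
void concat(ListBuf<T>& xs, ListBuf<T>& ys) {
    if (!xs.head) {                 // first list empty: take the second as-is
        xs = ys;
    } else if (ys.head) {
        xs.tail->next = ys.head;    // link in O(1) through the cached tail
        xs.tail = ys.tail;          // the combined tail is the second tail
    }
}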
The imperative algorithm without the augmented tail record can be described as below.

function Concat(L1, L2)
  if L1 = ∅ then
    return L2
  if L2 = ∅ then
    return L1
  H ← L1
  while Rest(L1) ≠ ∅ do
    L1 ← Rest(L1)
  Rest(L1) ← L2
  return H

And the corresponding ISO C++ example code is given like this.

template<typename T>
List<T>* concat(List<T>* xs, List<T>* ys) {
    List<T>* head;
    if (!xs) return ys;
    if (!ys) return xs;
    for (head = xs; xs->next; xs = xs->next);
    xs->next = ys;
    return head;
}

A.3.7 Sum and product

Recursive sum and product

It is common to calculate the sum or product of a list of numbers. They are quite similar in terms of algorithm structure; we'll see how to abstract such structure in a later section. In order to calculate the sum of a list: if the list is empty, the result is zero; otherwise, the result is the first element plus the sum of the rest of the list. Formalizing this description gives the following equation.

$$sum(L) = \begin{cases} 0 & : L = \emptyset \\ l_1 + sum(L') & : \text{otherwise} \end{cases} \tag{A.16}$$

However, we can't merely replace plus with times in this equation to obtain the product algorithm, because it would then always return zero. We can define the product of the empty list as 1 to solve this problem.

$$product(L) = \begin{cases} 1 & : L = \emptyset \\ l_1 \times product(L') & : \text{otherwise} \end{cases} \tag{A.17}$$

The following Haskell program implements sum and product.
sum [] = 0
sum (x:xs) = x + sum xs

product [] = 1
product (x:xs) = x * product xs

Both algorithms traverse the whole list during calculation, so they are bound to O(N) linear time.

Tail call recursion

Note that both the sum and product algorithms actually compute the result from right to left. We can change them to the `normal' direction, calculating the accumulated result from left to right. For example with sum, the result is accumulated starting from 0, and elements are added one by one to this accumulated result until the whole list is consumed. Such an approach can be described as follows. When accumulating the result of a list by summing:

If the list is empty, we are done and return the accumulated result;

Otherwise, we take the first element from the list, accumulate it into the result by summing, and go on processing the rest of the list.

Formalizing this idea yields another version of the sum algorithm.

$$sum'(A, L) = \begin{cases} A & : L = \emptyset \\ sum'(A + l_1, L') & : \text{otherwise} \end{cases} \tag{A.18}$$

And sum can be implemented by calling this function with the start value 0 and the list as arguments.

$$sum(L) = sum'(0, L) \tag{A.19}$$

The interesting point of this approach is that, besides calculating the result in the normal order from left to right, by observing the equation of sum'(A, L) we find it needn't remember any intermediate results or states when performing the recursion. All such states are either passed as arguments (A, for example) or can be dropped (previous elements of the list, for example). So in a practical implementation, such a recursive function can be optimized by eliminating the recursion altogether.

We call such a function `tail recursive' (a `tail call'), and the optimization of removing the recursion in this case `tail recursion optimization' [10], because the recursion happens as the final action of the function. The advantage of tail recursion optimization is that performance can be greatly improved, and we can avoid stack overflow in deeply recursive algorithms such as sum and product. Changing the sum and product Haskell programs to the tail recursive manner gives the following modified programs.

sum = sum' 0 where
  sum' acc [] = acc
  sum' acc (x:xs) = sum' (acc + x) xs

product = product' 1 where
  product' acc [] = acc
  product' acc (x:xs) = product' (acc * x) xs
In the previous section about insertion sort, we mentioned that the functional version sorts the elements from the right; this can also be modified into a tail recursive realization.

$$sort'(A, L) = \begin{cases} A & : L = \emptyset \\ sort'(insert(l_1, A), L') & : \text{otherwise} \end{cases} \tag{A.20}$$

The sorting algorithm just calls this function, passing the empty list as the accumulator argument.

$$sort(L) = sort'(\emptyset, L) \tag{A.21}$$

Implementing this tail recursive algorithm as a real program is left as an exercise to the reader.

At the end of this sub-section, let's consider an interesting problem: how to design an algorithm to compute b^N effectively? (Refer to problem 1.16 in [5].) A naive brute-force solution is to repeatedly multiply by b, N times, starting from 1, which leads to a linear O(N) algorithm.

function Pow(b, N)
  x ← 1
  loop N times
    x ← x × b
  return x

Actually, the solution can be greatly improved. Consider computing b^8. After the first 2 iterations of the above naive algorithm, we have x = b^2. At this stage, we needn't multiply x by b to get b^3; we can directly calculate x^2, which gives b^4. And doing this once more gives (b^4)^2 = b^8. Thus we only need 3 loops instead of 8. An algorithm based on this idea, for the case N = 2^M for some non-negative integer M, can be given by the following equation.

$$pow(b, N) = \begin{cases} b & : N = 1 \\ pow(b, \frac{N}{2})^2 & : \text{otherwise} \end{cases}$$

It is easy to extend this divide and conquer algorithm so that N can be any non-negative integer:

For the trivial case, N is zero and the result is 1;

If N is an even number, we can halve N, compute b^{N/2} first, then calculate the square of this result;

Otherwise, N is odd. Since N - 1 is even, we can first recursively compute b^{N-1}, then multiply by b once more.

The equation below formalizes this description.

$$pow(b, N) = \begin{cases} 1 & : N = 0 \\ pow(b, \frac{N}{2})^2 & : 2 \mid N \\ b \times pow(b, N-1) & : \text{otherwise} \end{cases} \tag{A.22}$$
However, it's hard to turn this algorithm into a tail recursive one, mainly because of the 2nd clause. In fact, the 2nd clause can alternatively be realized by squaring the base number and halving the exponent.

$$pow(b, N) = \begin{cases} 1 & : N = 0 \\ pow(b^2, \frac{N}{2}) & : 2 \mid N \\ b \times pow(b, N-1) & : \text{otherwise} \end{cases} \tag{A.23}$$

With this change, it's easy to get a tail recursive algorithm as follows, so that b^N = pow'(b, N, 1).

$$pow'(b, N, A) = \begin{cases} A & : N = 0 \\ pow'(b^2, \frac{N}{2}, A) & : 2 \mid N \\ pow'(b, N-1, A \times b) & : \text{otherwise} \end{cases} \tag{A.24}$$

Compared to the naive brute-force algorithm, we improved the performance to O(lg N). Actually, this algorithm can be improved one more step. Observe that if we represent N in binary format, N = (a_m a_{m-1} ... a_1 a_0)_2, the computation of b^{2^i} is necessary only if a_i = 1. This is quite similar to the idea of the Binomial heap (readers can refer to the chapter about binomial heaps in this book). Thus we can calculate the final result by multiplying the b^{2^i} for all bits with value 1.

For instance, when we compute b^11, as 11 = (1011)_2 = 2^3 + 2 + 1, we have b^11 = b^{2^3} × b^2 × b. We can get the result by the following steps.

1. Calculate b^1, which is b;
2. Get b^2 from the previous result;
3. Get b^{2^2} from step 2;
4. Get b^{2^3} from step 3.

Finally, we multiply the results of steps 1, 2, and 4, which yields b^11. Summarizing this idea, we can improve the algorithm as below.

$$pow'(b, N, A) = \begin{cases} A & : N = 0 \\ pow'(b^2, \frac{N}{2}, A) & : 2 \mid N \\ pow'(b^2, \lfloor \frac{N}{2} \rfloor, A \times b) & : \text{otherwise} \end{cases} \tag{A.25}$$

This algorithm essentially shifts N right by 1 bit each time (by dividing N by 2). If the LSB (Least Significant Bit, the lowest bit) is 0, N is even; the algorithm goes on computing the square of the base, without accumulating into the final product (just like the 3rd step in the above example). If the LSB is 1, N is odd; the algorithm squares the base and accumulates it into the product A. The edge case is when N is zero, which means we have exhausted all the bits in N; the final result is then the accumulator A. At any time, the updated base b′, the shifted exponent N′, and the accumulator A satisfy the invariant b^N = b'^{N'} × A. This algorithm can be implemented in Haskell like the following.
pow b n = pow' b n 1 where
  pow' b n acc | n == 0 = acc
               | even n = pow' (b*b) (n `div` 2) acc
               | otherwise = pow' (b*b) (n `div` 2) (acc*b)

Compared to the previous algorithm, which decreases N by one to make it even when N is odd, this one halves N every time. It runs exactly m rounds, where m is the number of bits of N. However, the performance is still bound to O(lg N). How to implement this algorithm imperatively is left as an exercise to the reader.
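As a hint, here is a minimal imperative sketch in ISO C++ (not the book's official solution); the types unsigned int and long long are arbitrary choices for illustration, and overflow handling is ignored.

// Iterative fast exponentiation following equation (A.25):
// shift n right one bit per round, square the base each time,
// and accumulate into a only when the least significant bit is 1.
long long power(long long b, unsigned int n) {
    long long a = 1;
    while (n) {
        if (n & 1)      // LSB is 1: n is odd, accumulate the product
            a *= b;
        b *= b;         // square the base
        n >>= 1;        // halve the exponent
    }
    return a;
}

At every iteration the invariant b^N = b'^{N'} × a from the text holds, which is a convenient way to convince yourself the loop is correct.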
Imperative sum and product

The imperative sum and product just apply plus and times while traversing the list.

function Sum(L)
  s ← 0
  while L ≠ ∅ do
    s ← s + First(L)
    L ← Rest(L)
  return s

function Product(L)
  p ← 1
  while L ≠ ∅ do
    p ← p × First(L)
    L ← Rest(L)
  return p

The corresponding ISO C++ example programs are listed as follows.

template<typename T>
T sum(List<T>* xs) {
    T s;
    for (s = 0; xs; xs = xs->next)
        s += xs->key;
    return s;
}

template<typename T>
T product(List<T>* xs) {
    T p;
    for (p = 1; xs; xs = xs->next)
        p *= xs->key;
    return p;
}

One interesting usage of the product algorithm is to calculate the factorial of N as the product of {1, 2, ..., N}, that is, N! = product([1..N]).

A.3.8 Maximum and minimum

Another very useful use case is to get the minimum or maximum element of a list. We'll see that their algorithm structures are quite similar again; we'll generalize this kind of feature and introduce higher-level abstraction in a later section. For both the maximum and minimum algorithms, we assume that the given list isn't empty. In order to
find the minimum element in a list:

If the list contains only one element (a singleton list), the minimum element is this one;

Otherwise, we first find the minimum element of the rest of the list, then compare the first element with this intermediate result to determine the final minimum value.

This algorithm can be formalized by the following equation.

$$min(L) = \begin{cases} l_1 & : L = \{l_1\} \\ l_1 & : l_1 \leq min(L') \\ min(L') & : \text{otherwise} \end{cases} \tag{A.26}$$

In order to get the maximum element instead of the minimum one, we can simply flip the comparison to ≥ in the above equation.

$$max(L) = \begin{cases} l_1 & : L = \{l_1\} \\ l_1 & : l_1 \geq max(L') \\ max(L') & : \text{otherwise} \end{cases} \tag{A.27}$$

Note that both maximum and minimum actually process the list from right to left. This reminds us of tail recursion. We can modify them so that the list is processed from left to right. What's more, the tail recursive version gives us an `on-line' algorithm: at any time, we hold the minimum or maximum of all the elements examined so far.

$$min'(L, a) = \begin{cases} a & : L = \emptyset \\ min'(L', l_1) & : l_1 < a \\ min'(L', a) & : \text{otherwise} \end{cases} \tag{A.28}$$

$$max'(L, a) = \begin{cases} a & : L = \emptyset \\ max'(L', l_1) & : a < l_1 \\ max'(L', a) & : \text{otherwise} \end{cases} \tag{A.29}$$

Different from the tail recursive sum and product, we can't pass a constant start value to min' or max' in practice. This is because in theory we would have to pass positive infinity (min(L, ∞)) or negative infinity (max(L, -∞)), but in a real machine neither can be represented, since the word length is limited. There is a workaround: we can instead pass the first element of the list, so that the algorithms become applicable.

$$min(L) = min'(L', l_1) \qquad max(L) = max'(L', l_1) \tag{A.30}$$

We skip the non-tail-recursive programs here, as they are intuitive enough; readers may take them as interesting exercises.
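As one possible shape of such a skipped program (a sketch only, reusing the List<T> type from the other C++ examples; the name minRec is hypothetical), the plain recursive minimum could look like this:

// Plain (non-tail) recursive minimum, following equation (A.26):
// the recursive call must return before the comparison can happen,
// so each call keeps a stack frame alive until then.
template<typename T>
T minRec(List<T>* xs) {
    if (!xs->next)
        return xs->key;          // singleton list: the element itself
    T m = minRec(xs->next);      // minimum of the rest sub-list
    return xs->key < m ? xs->key : m;
}

Contrast this with the tail call versions below, where the intermediate result travels in an argument instead of on the stack.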
The corresponding tail call programs are given below.

min (x:xs) = min' xs x where
  min' [] a = a
  min' (x:xs) a = if x < a then min' xs x else min' xs a

max (x:xs) = max' xs x where
  max' [] a = a
  max' (x:xs) a = if a < x then max' xs x else max' xs a

The tail call versions can easily be translated into imperative min/max algorithms.

function Min(L)
  m ← First(L)
  L ← Rest(L)
  while L ≠ ∅ do
    if First(L) < m then
      m ← First(L)
    L ← Rest(L)
  return m

function Max(L)
  m ← First(L)
  L ← Rest(L)
  while L ≠ ∅ do
    if m < First(L) then
      m ← First(L)
    L ← Rest(L)
  return m

The corresponding ISO C++ programs are given below.

template<typename T>
T min(List<T>* xs) {
    T x;
    for (x = xs->key; xs; xs = xs->next)
        if (xs->key < x)
            x = xs->key;
    return x;
}

template<typename T>
T max(List<T>* xs) {
    T x;
    for (x = xs->key; xs; xs = xs->next)
        if (x < xs->key)
            x = xs->key;
    return x;
}

Another method to achieve the tail call maximum (and minimum) algorithm is to discard the smaller element each time. The edge case is the same as before. For the recursive case, since there are at least two elements in the list, we can take the first two for comparing, then drop one and go on processing the rest. For a list with more than two elements, denote L'' as rest(rest(L)) = {l_3, l_4, ...}; we have
the following equations.

$$max(L) = \begin{cases} l_1 & : |L| = 1 \\ max(cons(l_1, L'')) & : l_2 < l_1 \\ max(L') & : \text{otherwise} \end{cases} \tag{A.31}$$

$$min(L) = \begin{cases} l_1 & : |L| = 1 \\ min(cons(l_1, L'')) & : l_1 < l_2 \\ min(L') & : \text{otherwise} \end{cases} \tag{A.32}$$

The corresponding example Haskell programs are given below.

min [x] = x
min (x:y:xs) = if x < y then min (x:xs) else min (y:xs)

max [x] = x
max (x:y:xs) = if x < y then max (y:xs) else max (x:xs)

Exercise A.1

- Given two lists L_1 and L_2, design an algorithm eq(L_1, L_2) to test whether they are equal to each other. Here equality means that the lengths are the same, and at the same time the elements at corresponding positions in both lists are identical.

- Consider various options to handle the out-of-bound error case when randomly accessing an element in a list. Realize them in both imperative and functional programming languages. Compare the solutions based on exceptions and on error codes.

- Augment the list with a `tail' field, so that the appending algorithm can be realized in constant O(1) time instead of linear O(N) time. Feel free to choose your favorite imperative programming language. Please don't refer to the example source code along with this book before you try it.

- With the `tail' field augmented to the list, for which list operations must this field be updated? How does it affect the performance?

- Handle the out-of-bound case in the insertion algorithm by treating it as appending.

- Write the insertion sort algorithm by only using less-than (<).

- Design and implement the algorithm that finds all occurrences of a given value and deletes them from the list.

- Reimplement the algorithm to calculate the length of a list in the tail call recursive manner.

- Implement the insertion sort in the tail recursive manner.

- Implement the O(lg N) algorithm to calculate b^N in your favorite imperative programming language. Note that we only need to accumulate the intermediate result when the bit is not zero.
A.4 Transformation

In the previous section, we listed the basic operations for linked lists. In this section, we focus on transformation algorithms for lists. Some of them are corner stones of abstraction in functional programming. We'll show how to use list transformation to solve some interesting problems.

A.4.1 Mapping and for-each

It is an everyday programming routine to output something as a readable string. If we have a list of numbers and want to print the list to the console like `3 1 2 5 4', one option is to convert the numbers to strings, so that we can feed them to the printing function. One such trivial conversion program may look like this.

$$toStr(L) = \begin{cases} \emptyset & : L = \emptyset \\ cons(str(l_1), toStr(L')) & : \text{otherwise} \end{cases} \tag{A.33}$$

The other example is that we have a dictionary, which is actually a list of words grouped by their initial letters, for example:

[[a, an, another, ...], [bat, bath, bool, bus, ...], ..., [zero, zoo, ...]]

We want to know the frequency of these words in English, so we process some English text, for example `Hamlet' or the `Bible', and augment each word with its number of occurrences in these texts. Now we have a list like this:

[[(a, 1041), (an, 432), (another, 802), ...],
 [(bat, 5), (bath, 34), (bool, 11), (bus, 0), ...],
 ...,
 [(zero, 12), (zoo, 0), ...]]

If we want to find the most frequently used word for each initial, how do we write a program to work this out? The output is a list of words, each of which has the most occurrences within its group, categorized by initial, something like `[a, but, can, ...]'. We actually need a program that can transform a list of groups of augmented words into a list of words.

Let's work it out step by step. First, we need to define a function which takes a list of word-number pairs and finds the word augmented with the biggest number. Sorting is overkill; what we need is just a special max'() function. Note that the max() function developed in the previous section can't be used directly. Suppose for a pair of values p = (a, b), the functions fst(p) = a and snd(p) = b are accessors extracting the values. Then max'() can be defined as follows.

$$max'(L) = \begin{cases} l_1 & : |L| = 1 \\ l_1 & : snd(max'(L')) \leq snd(l_1) \\ max'(L') & : \text{otherwise} \end{cases} \tag{A.34}$$

Alternatively, we can define a dedicated function to compare word/number-of-occurrence pairs, and generalize the max() function by passing a compare function.

$$less(p_1, p_2) = snd(p_1) < snd(p_2) \tag{A.35}$$

$$maxBy(cmp, L) = \begin{cases} l_1 & : |L| = 1 \\ l_1 & : cmp(l_1, maxBy(cmp, L')) \\ maxBy(cmp, L') & : \text{otherwise} \end{cases} \tag{A.36}$$

Then max'() is just a special case of maxBy() with the compare function comparing on the second value of a pair.

$$max'(L) = maxBy(\neg less, L) \tag{A.37}$$

Here we write all functions in a purely recursive way; they can be modified into the tail call manner. This is left as an exercise to the reader.
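A sketch of one possible answer in ISO C++ (the loop form is exactly what tail recursion optimization would produce; the comparator parameter Cmp is an assumption, mirroring cmp in equation (A.36)):

// maxBy as a loop: the accumulator m plays the role of the extra
// argument in a tail recursive definition. cmp(x, m) answering `true'
// means x should replace the current candidate m.
template<typename T, typename Cmp>
T maxBy(Cmp cmp, List<T>* xs) {
    T m = xs->key;                    // the list is assumed non-empty
    for (xs = xs->next; xs; xs = xs->next)
        if (cmp(xs->key, m))
            m = xs->key;
    return m;
}

For max'(), cmp would be the negation of the less function from (A.35).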
With the max'() function defined, it's possible to complete the solution by processing the whole list.

$$solve(L) = \begin{cases} \emptyset & : L = \emptyset \\ cons(fst(max'(l_1)), solve(L')) & : \text{otherwise} \end{cases} \tag{A.38}$$

Map

Comparing the solve() function in (A.38) and the toStr() function in (A.33) reveals a very similar algorithm structure, although they target very different problems, and one is trivial while the other is a bit complex. toStr() applies the function str(), which turns a number into a string, to every element in the list; solve() first applies the max'() function to every element (each of which is actually a list of pairs), then applies the fst() function, which essentially turns a list of lists of pairs into a list of words. It is not hard to abstract this common structure into the following equation, which is called mapping.

$$map(f, L) = \begin{cases} \emptyset & : L = \emptyset \\ cons(f(l_1), map(f, L')) & : \text{otherwise} \end{cases} \tag{A.39}$$

Because map takes a `converter' function f as an argument, it is a kind of high-order function. In a functional programming environment such as Haskell, mapping can be implemented just like the above equation.

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs

The two concrete cases we discussed above can both be represented by high-order mapping.

toStr = map str
solve = map (fst . max')

Where f ∘ g means function composition: we first apply g, then apply f. For instance, the function h(x) = f(g(x)) can be represented as h = f ∘ g, reading `function h is composed of f and g'. Note that we use the Curried form to omit the argument L for brevity. Informally speaking, if we feed a function which needs 2 arguments, for instance f(x, y) = z, with only 1 argument, the result is a function which needs 1 more argument. For instance, if we feed f with only the argument x, it becomes a new function of one argument y, defined as g(y) = f(x, y), or g = f x. Note that x isn't a free variable any more, as it is bound to a value. Readers can refer to any book about functional programming for details about function composition and Currying.
Mapping can also be understood from the domain theory point of view. Consider the function y = f(x); it actually defines a mapping from the domain of variable x to the domain of value y (x and y can have different types). If the domains can be represented as sets X and Y, we have the following relation.

$$Y = \{ f(x) \mid x \in X \} \tag{A.40}$$

This type of set definition is called Zermelo-Frankel set abstraction (also known as a ZF expression) [7]. The difference here is that the mapping is from a list to another list, so there can be duplicated elements. In languages supporting list comprehension, for example Haskell and Python (note that the Python list is a built-in type, not the linked list we discuss in this appendix), mapping can be implemented as a special case of list comprehension.

map f xs = [ f x | x <- xs ]

List comprehension is a powerful tool. Here is another example that realizes the permutation algorithm by list comprehension. Many textbooks introduce how to implement all-permutations for a list, such as [7] and [9]. It is possible to design a more general version perm(L, r): if the length of the list L is N, this algorithm permutes r elements out of the total N. We know there are P^N_r = N!/(N-r)! solutions.

$$perm(L, r) = \begin{cases} \{\emptyset\} & : r = 0 \lor |L| < r \\ \{ \{l\} \cup P \mid l \in L, P \in perm(L - \{l\}, r-1) \} & : \text{otherwise} \end{cases} \tag{A.41}$$

In this equation, {l} ∪ P means cons(l, P), and L - {l} denotes delete(L, l), which was defined in the previous section. If we take zero elements for permutation, or there are too few elements (fewer than r), the result is a list containing one empty list. Otherwise, in the non-trivial case, the algorithm picks an element l from the list, recursively permutes the remaining N - 1 elements by picking r - 1 of them, and then puts each possible l in front of each of the possible (r-1)-permutations. Here is the Haskell implementation of this algorithm.

perm _ 0 = [[]]
perm xs r | length xs < r = [[]]
          | otherwise = [ x:ys | x <- xs, ys <- perm (delete x xs) (r-1) ]

We'll return to list comprehension later in the section about filtering.

Mapping can also be realized imperatively: we apply the function while traversing the list, and construct the new list from left to right. Since each new element is appended to the result list, we can track the tail position to achieve constant time appending, so the mapping algorithm is linear in the number of applications of the passed-in function.

function Map(f, L)
  L' ← ∅
  p ← ∅
  while L ≠ ∅ do
    if p = ∅ then
      p ← Cons(f(First(L)), ∅)
      L' ← p
    else
      Next(p) ← Cons(f(First(L)), ∅)
      p ← Next(p)
    L ← Next(L)
  return L'

It is a bit complex to annotate the type of the passed-in function in ISO C++, as it involves some detailed language-specific features; see [11] for details. In fact, ISO C++ provides the very same mapping concept as std::transform. However, it requires the reader to know about function objects, iterators, etc., which are out of the scope of this book; readers can refer to any ISO C++ STL material for details. For brevity, we switch to the Python programming language for the example code, so that compile-time type annotation can be avoided. The
definition of a simple singly linked list in Python is given as follows.

class List:
    def __init__(self, x = None, xs = None):
        self.key = x
        self.next = xs

def cons(x, xs):
    return List(x, xs)

The mapping program takes a function and a linked list, and maps the function over every element as described in the above algorithm.

def mapL(f, xs):
    ys = prev = List()
    while xs is not None:
        prev.next = List(f(xs.key))
        prev = prev.next
        xs = xs.next
    return ys.next

Different from the pseudo code, this program uses a dummy node as the head of the resulting list, so it needn't test whether the variable storing the last appending position is NIL. This small trick makes the program compact; we only need to drop the dummy node before returning the result.

For each

For a trivial task such as printing a list of elements, it's quite OK to just print each element without converting the whole list to a list of strings first. We can actually simplify the program.

function Print(L)
  while L ≠ ∅ do
    print First(L)
    L ← Rest(L)

More generally, we can pass a procedure, such as printing, to this list traversal, so that the procedure is performed for each element.
function For-Each(L, P)
  while L ≠ ∅ do
    P(First(L))
    L ← Rest(L)

The for-each algorithm can be formalized in the recursive approach as well.

$$foreach(L, p) = \begin{cases} u & : L = \emptyset \\ do(p(l_1), foreach(L', p)) & : \text{otherwise} \end{cases} \tag{A.42}$$

Here u means unit; it can be understood as doing nothing. Its type is similar to the `void' concept in C or Java-like programming languages. The do() function evaluates all its arguments, discards all the results except for the last one, and returns that last result as the final value of do(). It is equivalent to (begin ...) in the Lisp families, and in some sense to the do block in Haskell. For details about the unit type, please refer to [4].

Note that the for-each algorithm is just a simplified mapping; there are only two minor differences:

It needn't form a result list; we care about the `side effect' rather than the returned value;

For-each focuses more on traversing, while mapping focuses more on applying a function, thus the order of arguments is typically arranged as map(f, L) and foreach(L, p).

Some functional programming facilities provide options both for returning the result list and for discarding it. For example, the Haskell Monad library provides mapM, mapM_ and forM, forM_. Readers can refer to language-specific materials for details.

Examples for mapping

We'll show how to use mapping with an example, which is a problem from ACM/ICPC [12]. For the sake of brevity, we have modified the problem description a bit.
Suppose there are N lights in a room, all of them off. We execute the following process N times:

1. We switch all the lights in the room, so that they are all on;
2. We switch the 2nd, 4th, 6th, ... lights: every other light is toggled; if a light is on, it turns off, and it turns on if its previous state was off;
3. We switch every third light, that is, the 3rd, 6th, 9th, ... are toggled;
4. ...

And in the last round, only the last light (the N-th light) is switched. The question is: how many lights are on in the end?

Before we show the best answer to this puzzle, let's first work out a naive brute-force solution. Suppose there are N lights, represented as a list of 0/1 numbers, where 0 means a light is off and 1 means on. The initial state is a list of N zeros: {0, 0, ..., 0}.
We can label the lights from 1 to N. A mapping can help us turn the above list into a labeled list. (Readers who are familiar with functional programming may use zipping to achieve this; we'll explain zipping in a later section.)

$$map(\lambda_i\ (i, 0),\ \{1, 2, 3, ..., N\})$$

This mapping augments each natural number with a zero; the result is a list of pairs: L = {(1, 0), (2, 0), ..., (N, 0)}.

Next we operate on this list of pairs N times, from 1 to N. At each time i, we switch the second value of a pair if its first value, the label, can be divided by i. Considering the facts that 1 - 0 = 1 and 1 - 1 = 0, we can realize the switching of a 0/1 value x as 1 - x. At the i-th operation, for a light (j, x), if i | j (that is, j mod i = 0), we perform the switch; otherwise, we leave the light untouched.

$$switch(i, (j, x)) = \begin{cases} (j, 1 - x) & : j \bmod i = 0 \\ (j, x) & : \text{otherwise} \end{cases} \tag{A.43}$$

The i-th operation on all lights can be realized as mapping again:

$$map(switch(i), L) \tag{A.44}$$

Note that here we use the Curried form of the switch() function, which is equivalent to map(λ_{(j,x)} switch(i, (j, x)), L).

Next we need to define a function proc() which performs the above mapping on L over and over, N times. One option is to realize it in a purely recursive way as follows, so that we can call it like proc({1, 2, ..., N}, L). (This can also be realized by folding, which will be explained in a later section.)

$$proc(I, L) = \begin{cases} L & : I = \emptyset \\ proc(I', map(switch(i_1), L)) & : \text{otherwise} \end{cases} \tag{A.45}$$

Where I = cons(i_1, I') if I isn't empty. At this stage, we can sum the second values of the pairs in list L to get the answer. The sum function has been defined in the previous section, so the only thing left is mapping.

$$solve(N) = sum(map(snd, proc(\{1, 2, ..., N\}, L))) \tag{A.46}$$

Translating this naive brute-force solution to Haskell yields the program below.

solve' = sum . map snd . proc where
  proc n = operate [1..n] $ map (\i -> (i, 0)) [1..n]
  operate [] xs = xs
  operate (i:is) xs = operate is (map (switch i) xs)
  switch i (j, x) = if j `mod` i == 0 then (j, 1 - x) else (j, x)

Let's see the answers for 1, 2, ..., 100 lights.

[1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,
 6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,
 8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,10]
This result is interesting:

the first 3 answers are 1;
the 4th to the 8th answers are 2;
the 9th to the 15th answers are 3;
...

It seems that the i^2-th to the ((i+1)^2 - 1)-th answers are i. Actually, we can prove this fact as follows.

Proof. Given N lights labeled from 1 to N, consider which lights are on in the end. Since the initial state of every light is off, a light is on if and only if it is switched an odd number of times. Light i is switched in round j if and only if j divides i (denoted j | i). So only the lights with an odd number of factors are on in the end.

So the key point to solving this puzzle is to find all the numbers which have an odd number of factors. For any positive integer N, denote by S the set of all factors of N. S is initialized to ∅. If p is a factor of N, there must exist a positive integer q such that N = pq, which means q is also a factor of N. So we add 2 different factors to the set S if and only if p ≠ q, which keeps |S| even all the time, unless p = q. In that case, N is a perfect square number, and we can only add 1 factor to the set S, which leads to an odd number of factors.

At this stage, we can design a fast solution by counting the perfect square numbers not greater than N.

$$solve(N) = \lfloor \sqrt{N} \rfloor \tag{A.47}$$

The following Haskell command verifies that the answers for 1, 2, ..., 100 lights are the same as above.

map (floor . sqrt) [1..100]
[1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,
 6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,
 8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,10]

Mapping is a generic concept: it isn't limited to linked lists, but can be applied to many complex data structures. The chapter about binary search trees in this book explains how to map over trees. As long as we can traverse a data structure in some order, and the empty data structure can be identified, we can use the same mapping idea. We'll return to this kind of high-order concept in the section about folding later.

A.4.2 Reverse

How to reverse a singly linked list with minimum space is a popular technical interview problem in some companies. The pointer manipulation must be arranged carefully in imperative programming languages such as ANSI C. However, we'll show that there exists an easy way to write this program:

1. First, write a pure recursive straightforward solution;
2. Then, transform the pure recursive solution into the tail call manner;

3. Finally, translate the tail call solution into pure imperative pointer operations.

The pure recursive solution is simple enough that we can write it out immediately. In order to reverse a list L:

If L is empty, the reversed result is empty. This is the trivial edge case;

Otherwise, we can first reverse the rest of the sub-list, then append the first element to the end.

This idea can be formalized to the equation below.

$$reverse(L) = \begin{cases} \emptyset & : L = \emptyset \\ append(reverse(L'), l_1) & : \text{otherwise} \end{cases} \tag{A.48}$$

Translating it to Haskell yields the program below.

reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

However, this solution doesn't perform well, as appending has to traverse to the end of the list, which leads to a quadratic time algorithm. It is not hard to improve this program by changing it to the tail call manner: we use an accumulator to store the intermediate reversed result, initializing the accumulated result as empty. So the algorithm is formalized as reverse(L) = reverse'(L, ∅).

$$reverse'(L, A) = \begin{cases} A & : L = \emptyset \\ reverse'(L', \{l_1\} \cup A) & : \text{otherwise} \end{cases} \tag{A.49}$$

Where {l_1} ∪ A means cons(l_1, A). Different from appending, it's a constant O(1) time operation. The core idea is that we repeatedly take elements one by one from the head of the original list, and put them in front of the accumulated result. This is just like storing all the elements in a stack, then popping them out. It is a linear time algorithm. The Haskell program below implements this tail call version.

reverse' [] acc = acc
reverse' (x:xs) acc = reverse' xs (x:acc)

Since a tail recursive call needn't book-keep any context (typically kept on the stack), most modern compilers are able to optimize it into a pure imperative loop, reusing the current context, stack, etc. Let's do this optimization manually, so that we get an imperative algorithm.

function Reverse(L)
  A ← ∅
  while L ≠ ∅ do
    A ← Cons(First(L), A)
    L ← Rest(L)
  return A
However, because we translated it directly from a functional solution, this algorithm actually produces a new reversed list, and does not mutate the original one. It is not hard to change it into an in-place solution by reusing L. For example, the following ISO C++ program implements the in-place algorithm. It takes O(1) memory space, and reverses the list in O(N) time.

template<typename T>
List<T>* reverse(List<T>* xs) {
    List<T> *p, *ys = NULL;
    while (xs) {
        p = xs;
        xs = xs->next;
        p->next = ys;
        ys = p;
    }
    return ys;
}

Exercise A.2

- Implement the algorithm to find the maximum element in a list of pairs in the tail call approach in your favorite programming language.

A.5 Extract sub-lists

Different from arrays, which can slice out a continuous segment fast and easily, it takes more work to extract sub-lists from a singly linked list. Such operations are typically linear algorithms.

A.5.1 take, drop, and split-at

Taking
the first N elements from a list is semantically similar to extracting a sub-list from the very left, like sublist(L, 1, N), where the second and third arguments to sublist are the positions at which the sub-list starts and ends.

For the trivial edge case, that either N is zero or the list is empty, the sub-list is empty; otherwise, we can recursively take the first N-1 elements from the rest of the list, and put the first element in front of the result.

$$take(N, L) = \begin{cases} \emptyset & : L = \emptyset \lor N = 0 \\ cons(l_1, take(N-1, L')) & : \text{otherwise} \end{cases} \tag{A.50}$$

Note that the edge cases actually handle the out-of-bound error. The following Haskell program implements this algorithm.

take _ [] = []
take 0 _ = []
take n (x:xs) = x : take (n-1) xs

Dropping, on the other hand, drops the first N elements and returns the rest as the result. It is equivalent to getting the sub-list on the right, like sublist(L, N+1, |L|), where |L| is the length of the list. Dropping can be designed quite similarly to taking, by discarding the first element in the recursive case.

$$drop(N, L) = \begin{cases} \emptyset & : L = \emptyset \\ L & : N = 0 \\ drop(N-1, L') & : \text{otherwise} \end{cases} \tag{A.51}$$

Translating the algorithm to Haskell gives the example program below.

drop _ [] = []
drop 0 L = L
drop n (x:xs) = drop (n-1) xs

The imperative taking and dropping are quite straightforward, and they are left as exercises to the reader.
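For reference, here is a minimal sketch of what such imperative solutions could look like, in the ISO C++ style used elsewhere in this appendix (one possible answer to the exercise, assuming the same List<T> and cons<T> helpers as the other C++ examples):

// Take the first n elements by copying them into a new list.
template<typename T>
List<T>* take(int n, List<T>* xs) {
    List<T> *head = NULL, *tail = NULL;
    for (; n && xs; --n, xs = xs->next) {
        List<T>* cell = cons<T>(xs->key, (List<T>*)NULL);
        if (!head)
            head = tail = cell;      // first copied cell
        else {
            tail->next = cell;       // append at the tracked tail
            tail = cell;
        }
    }
    return head;
}

// Drop the first n elements by simply advancing the head pointer.
template<typename T>
List<T>* drop(int n, List<T>* xs) {
    for (; n && xs; --n)
        xs = xs->next;
    return xs;
}

Note that taking copies cells (so the original list is untouched), while dropping merely shares the remaining cells with the original list.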
  • 2896. ned, extracting sub list at arbitrary position for arbitrary length can be realized by calling them. sublist(L; from; count) = take(count; drop(from 1;L)) (A.52) or in another semantics by providing left and right boundaries: sublist(L; from; to) = drop(from 1; take(to;L)) (A.53) Note that the elements in range [from; to] is returned by this function, with both ends included. All the above algorithms perform in linear time. take-while and drop-while Compare to taking and dropping, there is another type of operation, that we either keep taking or dropping elements as far as a certain condition is met. The taking and dropping algorithms can be viewed as special cases for take-while and drop-while. Take-while examines elements one by one as far as the condition is satis
take-while and drop-while

Compared to taking and dropping, there is another type of operation: we keep taking or dropping elements as long as a certain condition is met. The taking and dropping algorithms can be viewed as special cases of take-while and drop-while.

Take-while examines elements one by one, as far as the condition is satisfied, and ignores all the rest of the elements, even if some of them satisfy the condition. This is the point where it differs from filtering, which we'll explain in a later section: take-while stops once the condition test fails, while filtering traverses the whole list.

takeWhile(p, L) =
  ∅ : L = ∅
  ∅ : ¬p(l1)
  cons(l1, takeWhile(p, L')) : otherwise    (A.54)

Take-while accepts two arguments: one is the predicate function p, which can be applied to an element in the list and returns a Boolean value as result; the other is the list to be processed. It is easy to define drop-while symmetrically.

dropWhile(p, L) =
  ∅ : L = ∅
  L : ¬p(l1)
  dropWhile(p, L') : otherwise    (A.55)

The corresponding Haskell example programs are given below.

takeWhile _ [] = []
takeWhile p (x:xs) = if p x then x : takeWhile p xs else []

dropWhile _ [] = []
dropWhile p xs@(x:xs') = if p x then dropWhile p xs' else xs
split-at

With taking and dropping defined, splitting-at can be realized trivially by calling them.

splitAt(i, L) = (take(i, L), drop(i, L))    (A.56)
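As a quick sketch in Haskell (splitAt is standard, so the name is primed here); note that this direct composition traverses the list twice, while a library implementation may do it in a single pass:

splitAt' :: Int -> [a] -> ([a], [a])
splitAt' i xs = (take i xs, drop i xs)

For example, splitAt' 2 [1,2,3,4,5] evaluates to ([1,2],[3,4,5]).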
A.5.2 breaking and grouping

breaking

Breaking can be considered a general form of splitting. Instead of splitting at a given position, breaking examines every element against a certain predicate, and finds the longest prefix of the list for that condition. The result is a pair of sub-lists: one is that longest prefix, the other is the rest.

There are two different breaking semantics: one is to pick elements satisfying the predicate as long as possible; the other is to pick those that don't satisfy it. The former is typically defined as span, the latter as break. Span can be described, for example, in the following recursive manner. In order to span a list L for a predicate p:

- If the list is empty, the result for this trivial edge case is a pair of empty lists (∅, ∅);

- Otherwise, we test the predicate against the first element l1. If l1 satisfies the predicate, we denote the intermediate result of spanning the rest of the list as (A, B) = span(p, L'), and put l1 in front of A to get the pair ({l1} ∪ A, B); otherwise, we just return (∅, L) as the result.

For breaking, we just test the negation of the predicate, and all the rest is the same as for spanning. Alternatively, one can define breaking by using span, as in the later example program.

span(p, L) =
  (∅, ∅) : L = ∅
  ({l1} ∪ A, B) : p(l1) = True, where (A, B) = span(p, L')
  (∅, L) : otherwise    (A.57)

break(p, L) =
  (∅, ∅) : L = ∅
  ({l1} ∪ A, B) : ¬p(l1), where (A, B) = break(p, L')
  (∅, L) : otherwise    (A.58)

Note that both functions only find the longest prefix; they stop immediately when the condition fails, even if elements later in the list would meet the predicate (or not). Translating them to Haskell gives the following example program.

span _ [] = ([], [])
span p xs@(x:xs') = if p x
                    then let (as, bs) = span p xs' in (x:as, bs)
                    else ([], xs)

break p = span (not . p)

Span and break can also be realized imperatively as follows.

function Span(p, L)
  A ← ∅
  while L ≠ ∅ ∧ p(l1) do
    A ← Cons(l1, A)
    L ← Rest(L)
  return (A, L)

function Break(p, L)
  return Span(¬p, L)

Note that this version builds the prefix A in reverse order; a linear-time reverse, as introduced earlier, restores the original ordering.
This algorithm creates a new list to hold the longest prefix. Another option is to turn it into an in-place algorithm that reuses the space, as in the following Python example.

def span(p, xs):
    ys = xs
    last = None
    while xs is not None and p(xs.key):
        last = xs
        xs = xs.next
    if last is None:
        return (None, xs)
    last.next = None
    return (ys, xs)

Note that both span and break need to traverse the list to test the predicate; thus they are linear algorithms, bound to O(N).

grouping

Grouping is a commonly used operation for problems where we need to divide a list into small groups. For example, suppose we want to group the string 'Mississippi', which is actually a list of characters {'M', 'i', 's', 's', 'i', 's', 's', 'i', 'p', 'p', 'i'}, into several small lists in sequence, each containing consecutive identical characters. The expected grouping result is:

group('Mississippi') = {'M', 'i', 'ss', 'i', 'ss', 'i', 'pp', 'i'}

In another example, we have a list of numbers:

L = {15, 9, 0, 12, 11, 7, 10, 5, 6, 13, 1, 4, 8, 3, 14, 2}

We want to divide it into several small lists, such that each sub-list is ordered descending. The expected grouping result is:

group(L) = {{15, 9, 0}, {12, 11, 7}, {10, 5}, {6}, {13, 1}, {4}, {8, 3}, {14, 2}}
Both cases play a very important role in real algorithms. The string grouping is used in creating the Trie/Patricia data structure, a powerful tool in the string searching area; the ordered sub-list grouping can be used in natural merge sort. There are dedicated chapters in this book explaining these algorithms in detail.

It is obvious that we need to abstract the grouping condition, so that we know where to break the original list into small ones. This predicate can be passed to the algorithm as an argument, like group(p, L), where the predicate p accepts two consecutive elements and tests if the condition matches. The first idea for solving the grouping problem is traversal: take two elements at a time; if the predicate test succeeds, put both into the current group; otherwise, only put the first one into the group, and use the second one to initialize a new group. Denote the first two elements (if they exist) as l1, l2, and the sub-list without the first element as L'. The result is a list of lists G = {g1, g2, ...}, denoted as G = group(p, L).

group(p, L) =
  {∅} : L = ∅
  {{l1}} : |L| = 1
  {{l1} ∪ g1', g2', ...} : p(l1, l2), where G' = group(p, L') = {g1', g2', ...}
  {{l1}, g1', g2', ...} : otherwise    (A.59)

Note that {l1} ∪ g1' actually means cons(l1, g1'), which performs in constant time. This is a linear algorithm, performing proportionally to the length of the list; it traverses the list in one pass, bound to O(N). Translating this to Haskell gives the example code below.

group _ [] = [[]]
group _ [x] = [[x]]
group p (x:xs@(x':_)) | p x x' = (x:ys):yss
                      | otherwise = [x]:r
  where r@(ys:yss) = group p xs

It is possible to implement this algorithm in an imperative approach: initialize the result groups as {{l1}} if L isn't empty, then traverse the list from the second element, appending to the last group if two consecutive elements satisfy the predicate, or starting a new group otherwise.

function Group(p, L)
  if L = ∅ then
    return {∅}
  x ← First(L)
  L ← Rest(L)
  g ← {x}
  G ← {g}
  while L ≠ ∅ do
    y ← First(L)
    if p(x, y) then
      g ← Append(g, y)
    else
      g ← {y}
      G ← Append(G, g)
    x ← y
    L ← Rest(L)
  return G

However, different from the recursive algorithm, this program performs in quadratic time if the appending function isn't optimized by storing the tail position. The corresponding Python program is given below.

def group(p, xs):
    if xs is None:
        return List(None)
    (x, xs) = (xs.key, xs.next)
    g = List(x)
    G = List(g)
    while xs is not None:
        y = xs.key
        if p(x, y):
            g = append(g, y)
        else:
            g = List(y)
            G = append(G, g)
        x = y
        xs = xs.next
    return G
With the grouping function defined, the two example cases mentioned at the beginning of this section can be realized by passing different predicates.

group(=, {M, i, s, s, i, s, s, i, p, p, i}) = {{M}, {i}, {ss}, {i}, {ss}, {i}, {pp}, {i}}

group(>, {15, 9, 0, 12, 11, 7, 10, 5, 6, 13, 1, 4, 8, 3, 14, 2}) = {{15, 9, 0}, {12, 11, 7}, {10, 5}, {6}, {13, 1}, {4}, {8, 3}, {14, 2}}

Another solution is to use the span function we defined to realize grouping. We pass a predicate to span, which breaks the list into two parts: the first part is the longest prefix satisfying the condition. We can repeatedly apply span with the same predicate to the second part, till it becomes empty. However, the predicate function passed to span is a unary function: it takes one element as argument and tests whether it satisfies the condition, while in the grouping algorithm the predicate is a binary function, taking two adjacent elements for testing. The solution is to use currying: pass the first element to the binary predicate, and use the resulting unary predicate to test the rest of the elements.

group(p, L) =
  ∅ : L = ∅
  {{l1} ∪ A} ∪ group(p, B) : otherwise    (A.60)

Where (A, B) = span(λx · p(l1, x), L') is the result of spanning on the rest of the list L'.
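Equation (A.60) translates almost literally to Haskell; this is essentially how the standard Data.List.groupBy behaves (the primed name avoids the clash):

groupBy' :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy' _ [] = []
groupBy' p (x:xs) = (x:as) : groupBy' p bs
  where (as, bs) = span (p x) xs    -- curry the first element into the binary predicate

Here span (p x) is the curried unary predicate described above.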
Although this newly defined grouping function generates the correct result for the first case, as in the following Haskell snippet:

groupBy (==) "Mississippi"
["M","i","ss","i","ss","i","pp","i"]

it seems that this algorithm can't group the list of numbers into ordered sub-lists.

groupBy (>) [15, 9, 0, 12, 11, 7, 10, 5, 6, 13, 1, 4, 8, 3, 14, 2]
[[15,9,0,12,11,7,10,5,6,13,1,4,8,3,14,2]]

The reason is that the first element, 15, is used as the left parameter to the > operator for span, and since 15 is the maximum value in this list, span ends up putting all elements into A, leaving B empty. This might seem a defect, but it is actually the correct behavior if the semantics is to group equal elements together. Strictly speaking, an equality predicate must satisfy three properties: reflexive, transitive, and symmetric. They are specified as follows.

Reflexive. x = x: any element is equal to itself;

Transitive. x = y, y = z ⇒ x = z: if two elements are equal, and one of them is equal to a third, then all three are equal;

Symmetric. x = y ⇔ y = x: the order of comparing two equal elements doesn't affect the result.

When we group the character list 'Mississippi', the equality (=) operator is used, which obviously conforms to these three properties, so it generates the correct grouping result. However, when > is passed as the equality predicate to group a list of numbers, it violates both the reflexive and symmetric properties; that is the reason why we get the wrong grouping result. This fact means that the second algorithm, designed using span, limits the semantics to strict equality, while the first one does not: it just tests the condition for every two adjacent elements, which is much weaker than equality.

Exercise A.3

1. Implement the in-place imperative taking and dropping algorithms in your favorite programming language; note that the out-of-bound cases should be handled. Please try languages both with and without GC (Garbage Collection) support.

2. Implement take-while and drop-while in your favorite imperative programming language. Please try both a dynamically typed language and a statically typed language (with and without type inference). How can the type of the predicate function be specified as generically as possible in a static type system?

3. Consider the following definition of span.

span(p, L) =
  (∅, ∅) : L = ∅
  ({l1} ∪ A, B) : p(l1) = True, where (A, B) = span(p, L')
  (A, {l1} ∪ B) : otherwise

What's the difference between this algorithm and the one we've shown in this section?

4. Implement the grouping algorithm by using span, in an imperative way, in your favorite programming language.
A.6 Folding

We are ready to introduce one of the most critical concepts in higher-order programming: folding. It is such a powerful tool that almost all the algorithms so far in this appendix can be realized by folding. Folding is sometimes named reducing (the abstracted concept is, in some sense, identical to the buzzword 'map-reduce' in cloud computing). For example, both STL and Python provide a reduce function, which realizes a partial form of folding.

A.6.1 folding from right

Recall the sum and product definitions from the previous section; they are actually quite similar.

sum(L) =
  0 : L = ∅
  l1 + sum(L') : otherwise

product(L) =
  1 : L = ∅
  l1 × product(L') : otherwise

It is obvious that they share the same structure. What's more, if we list the insertion sort definition, we find that it also shares this structure.

sort(L) =
  ∅ : L = ∅
  insert(l1, sort(L')) : otherwise

This hints that we can abstract out this essential common structure, so that we needn't repeat it again and again. Observing sum, product, and sort, there are two points of difference that we can parameterize:

- The result of the trivial edge case varies. It is zero for sum, 1 for product, and the empty list for sorting.

- The function applied to the first element and the intermediate result varies. It is plus for sum, multiply for product, and ordered insertion for sorting.

If we parameterize the result of the trivial edge case as an initial value z (standing for an abstract zero concept), and the function applied in the recursive case as f (which takes two parameters: the first element in the list, and the recursive result for the rest of the list), this common structure can be defined as something like the following.

proc(f, z, L) =
  z : L = ∅
  f(l1, proc(f, z, L')) : otherwise

That's it, and we should give this common structure a better name than the meaningless 'proc'. Let's first examine its characteristics. For a list L = {x1, x2, ..., xN}, we can expand the computation as follows.

proc(f, z, L) = f(x1, proc(f, z, L'))
             = f(x1, f(x2, proc(f, z, L'')))
             ...
             = f(x1, f(x2, f(..., f(xN, proc(f, z, ∅))...)))
             = f(x1, f(x2, f(..., f(xN, z))...))
Since f takes two parameters, it's a binary function; thus we can write it in infix form. The infix form is defined as below.

x ⊕f y = f(x, y)    (A.61)

The above expanded result is equivalent to the following in infix notation.

proc(f, z, L) = x1 ⊕f (x2 ⊕f (...(xN ⊕f z))...)

Note that the parentheses are necessary, because the computation starts from the right-most (xN ⊕f z), and repeatedly folds leftwards towards x1. This is quite similar to folding a Chinese hand-fan, as illustrated in the photos of Figure A.3. A Chinese hand-fan is made of bamboo and paper. Multiple bamboo frames are stuck together with an axis at one end. The arc-shaped paper is fully expanded by these frames, as shown in Figure A.3 (a); the fan can be closed by folding the paper. Figure A.3 (b) shows the fan partly folded from the right. After the folding finishes, the fan becomes a stick, as shown in Figure A.3 (c).

Figure A.3: Folding a Chinese hand-fan. (a) A folding fan fully opened. (b) The fan partly folded on the right. (c) The fan fully folded, closed to a stick.

We can consider each bamboo frame, along with the paper on it, as an element, so these frames form a list. A unit step in closing the fan is to rotate
a frame by a certain angle, so that it lies on top of the collapsed part. When we start closing the fan, the initial collapsed result is the first bamboo frame. The closing process folds from one end, repeatedly applying the unit closing step, till all the frames are rotated, and the result is the closed, stick form.

Actually, the sum and product algorithms do exactly the same thing as closing the fan.

sum({1, 2, 3, 4, 5}) = 1 + (2 + (3 + (4 + 5)))
                     = 1 + (2 + (3 + 9))
                     = 1 + (2 + 12)
                     = 1 + 14
                     = 15

product({1, 2, 3, 4, 5}) = 1 × (2 × (3 × (4 × 5)))
                         = 1 × (2 × (3 × 20))
                         = 1 × (2 × 60)
                         = 1 × 120
                         = 120

In functional programming, we name this process folding; in particular, since the execution starts from the innermost structure, i.e. from the right-most element, this type of folding is named folding right.

foldr(f, z, L) =
  z : L = ∅
  f(l1, foldr(f, z, L')) : otherwise    (A.62)

Let's see how to use fold-right to realize sum and product.

Σ_{i=1}^{N} xi = x1 + (x2 + (x3 + ... + (x_{N−1} + xN))...)
             = foldr(+, 0, {x1, x2, ..., xN})    (A.63)

Π_{i=1}^{N} xi = x1 × (x2 × (x3 × ... × (x_{N−1} × xN))...)
             = foldr(×, 1, {x1, x2, ..., xN})    (A.64)

The insertion-sort algorithm can also be
defined by using folding right.

sort(L) = foldr(insert, ∅, L)    (A.65)
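Equation (A.62) and its three instances map directly to Haskell. The Prelude already provides foldr, sum, and product, so the primed names below are ours; insert comes from Data.List:

import Data.List (insert)

foldr' :: (a -> b -> b) -> b -> [a] -> b
foldr' _ z [] = z
foldr' f z (x:xs) = f x (foldr' f z xs)

sum'     = foldr' (+) 0       -- sum' [1..5] == 15
product' = foldr' (*) 1       -- product' [1..5] == 120
sort'    = foldr' insert []   -- insertion sort: sort' [3,1,2] == [1,2,3]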
A.6.2 folding from left

As mentioned in the section about tail-recursive calls, both the purely recursive sum and product compute from right to left, and they must book-keep all the intermediate results and contexts. As we abstracted fold-right from that very structure, folding from right does this book-keeping as well, which is expensive if the list is very long.

Since we can change the realization of sum and product to tail-call manner, it is quite possible to provide another folding algorithm, which processes the list from left to right in normal order, and enables tail-call optimization by reusing the same context. Instead of deriving it from sum, product, and insertion, we can directly change fold-right into a tail call. Observe that the initial value z actually represents the intermediate result at any time, so we can use it as the accumulator.

foldl(f, z, L) =
  z : L = ∅
  foldl(f, f(z, l1), L') : otherwise    (A.66)

Every time the list isn't empty, we take the first element, and apply function f to the accumulator z and it, to get a new accumulator z' = f(z, l1). After that, we repeat the folding with the very same function f, the updated accumulator z', and the list L'. Let's verify that this tail-call algorithm actually folds from the left.

Σ_{i=1}^{5} i = foldl(+, 0, {1, 2, 3, 4, 5})
  = foldl(+, 0 + 1, {2, 3, 4, 5})
  = foldl(+, (0 + 1) + 2, {3, 4, 5})
  = foldl(+, ((0 + 1) + 2) + 3, {4, 5})
  = foldl(+, (((0 + 1) + 2) + 3) + 4, {5})
  = foldl(+, ((((0 + 1) + 2) + 3) + 4) + 5, ∅)
  = 0 + 1 + 2 + 3 + 4 + 5

Note that we actually delayed the evaluation of f(z, l1) in every step. (This is the exact behavior in systems supporting lazy evaluation, for instance Haskell. In strict systems such as Standard ML, it's not the case: the accumulator is evaluated as the sequence {1, 3, 6, 10, 15} across the calls.) Generally, folding-left can be expanded in the form

foldl(f, z, L) = f(f(...f(f(z, l1), l2), ...), lN)    (A.67)

Or in infix manner as

foldl(f, z, L) = (...((z ⊕f l1) ⊕f l2) ⊕f ...) ⊕f lN    (A.68)

With folding from left defined, sum, product, and insertion-sort can be transparently implemented by calling foldl, as sum(L) = foldl(+, 0, L), product(L) = foldl(×, 1, L), and sort(L) = foldl(insert, ∅, L). Compared with the folding-right versions, they look almost the same at first glance; however, the internal implementations differ.
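Equation (A.66) in Haskell, with the caveat the text just mentioned: since Haskell is lazy, the accumulator builds up delayed thunks, which is why practical code uses the strict foldl' from Data.List. A sketch with our own doubly-primed name:

foldl'' :: (b -> a -> b) -> b -> [a] -> b
foldl'' _ z [] = z
foldl'' f z (x:xs) = foldl'' f (f z x) xs

For example, foldl'' (+) 0 [1..5] evaluates to 15.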
Imperative folding and the generic folding concept

The tail-call nature of the folding-left algorithm is quite friendly to imperative settings: even if the compiler isn't equipped with tail-call optimization, we can implement the folding as a while-loop manually.

function Fold(f, z, L)
  while L ≠ ∅ do
    z ← f(z, First(L))
    L ← Rest(L)
  return z

Translating this algorithm to Python yields the following example program.

def fold(f, z, xs):
    for x in xs:
        z = f(z, x)
    return z

Actually, Python provides the reduce function, which does this very thing. (In ISO C++, this is provided as the accumulate algorithm in STL.)

Almost no imperative environment provides a folding-right function, because it would cause a stack overflow problem if the list is too long. However, there still exist cases where the folding-from-right semantics is necessary. For example, suppose one defines a container which only provides an insertion function at the head, with no appending method, and we want a fromList tool.

fromList(L) = foldr(insertHead, empty, L)

Calling fromList with the insertion function and an empty initialized container turns a list into this special container. Actually, the singly linked list is such a container: it performs well on insertion at the head, but degrades to linear time when appending at the tail. Folding from right is quite natural when duplicating a linked list while keeping the element ordering, while folding from left would generate a reversed list. In such cases, there exists an alternative way to implement imperative folding from right: first reverse the list, and then fold the reversed one from the left.

function Fold-Right(f, z, L)
  return Fold(f, z, Reverse(L))

Note that here we must use the tail-call version of reversing, or the stack overflow issue still exists.

One may think folding-left should be chosen over folding-right in most cases, because it's friendly to tail-call optimization, suitable for both functional and imperative settings, and it's an online algorithm. However, folding-right plays a critical role when the input list is infinite and the binary function f is lazy. For example, the Haskell program below wraps every element of an infinite list in a singleton, and returns the first 10 results.

take 10 $ foldr (\x xs -> [x]:xs) [] [1..]
[[1],[2],[3],[4],[5],[6],[7],[8],[9],[10]]

This can't be achieved by folding left, because the outermost evaluation can't finish until the whole list has been processed. The details are specific to the lazy evaluation feature, which is out of the scope of this book. Readers can refer to [13] for details.

Although the main topic of this appendix is singly linked-list algorithms, the folding concept itself is generic: it is not limited to lists, but can also be applied to other data structures. We can fold a tree, a queue, or even more complicated data structures, as long as we have the following:

- The empty data structure can be identified for the trivial edge case (e.g. the empty tree);

- We can traverse the data structure (e.g. traverse the tree in pre-order).
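As a small illustration of this widened view, the sketch below folds a binary tree in pre-order; the Tree type and foldTree are our own illustrative definitions, not taken from the book's tree chapters:

data Tree a = Empty | Node (Tree a) a (Tree a)

-- fold the elements in pre-order: the node first, then the left and right sub-trees
foldTree :: (a -> b -> b) -> b -> Tree a -> b
foldTree _ z Empty = z
foldTree f z (Node l x r) = f x (foldTree f (foldTree f z r) l)

For example, foldTree (:) [] (Node (Node Empty 2 Empty) 1 (Node Empty 3 Empty)) yields the pre-order list [1,2,3], and foldTree (+) 0 sums the elements.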
Some languages provide support for this high-level concept; for example, Haskell achieves it via the monoid. Readers can refer to [8] for details. Many chapters in this book use this widened concept of folding.

A.6.3 folding in practice

We have seen that max, min, and insertion sort can all be realized by folding. The brute-force solution for the 'drunk jailer' puzzle shown in the mapping section can also be designed by a mixed use of mapping and folding. Recall that we create a list of pairs, each pair containing the number of a light and its on-off state. After that, we process the numbers from 1 to N, switching a light if its number divides evenly. The whole process can be viewed as folding.

fold(step, {(1, 0), (2, 0), ..., (N, 0)}, {1, 2, ..., N})

The initial value is the very first state, with all the lights off. The list to be folded is the list of operations from 1 to N. The function step takes two arguments: one is the list of light-state pairs, the other is the operation number i. It then maps over all the lights and performs the switching, so we can substitute step with mapping.

fold(λ(L, i) · map(switch(i), L), {(1, 0), (2, 0), ..., (N, 0)}, {1, 2, ..., N})

We'll simplify the notation, and directly write map(switch(i), L) for brevity. The result of this folding is the final list of state pairs; we need to take the second part of each pair via mapping, then calculate the summation.

sum(map(snd, fold(map(switch(i), L), {(1, 0), (2, 0), ..., (N, 0)}, {1, 2, ..., N})))    (A.69)

There are materials providing plenty of good examples of folding; especially in [1], folding together with the fusion law is well explained.
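To make equation (A.69) concrete, here is a hedged Haskell sketch of this mixed map/fold solution; the names switch and drunkJailer are ours, and since the operations 1..N are processed from the left, the strict foldl' fits:

import Data.List (foldl')

-- toggle the state of every light whose number is divisible by i
switch :: Int -> (Int, Int) -> (Int, Int)
switch i (j, s) = if j `mod` i == 0 then (j, 1 - s) else (j, s)

drunkJailer :: Int -> Int
drunkJailer n = sum (map snd (foldl' step [(i, 0) | i <- [1..n]] [1..n]))
  where step states i = map (switch i) states

Evaluating drunkJailer 100 gives 10: exactly the lights at perfect-square positions stay on.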
concatenate a list of lists

In the previous section A.3.6 about concatenation, we explained how to concatenate two lists. Actually, concatenation of lists can be considered the equivalent of summation of numbers. Thus we can design a general algorithm which concatenates multiple lists into one big list; what's more, we can realize this general concatenation by folding. As sum can be represented as sum(L) = foldr(+, 0, L), it's straightforward to write the following equation.

concats(L) = foldr(concat, ∅, L)    (A.70)

Where L is a list of lists, for example {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, ...}. The function concat(L1, L2) is the one we defined in section A.3.6. In environments which support lazy evaluation, such as Haskell, this algorithm is capable of concatenating an infinite list of lists, as the binary function ++ is lazy.
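Equation (A.70) is a one-liner in Haskell; the standard Prelude names this function concat, and we keep the equation's name here:

concats :: [[a]] -> [a]
concats = foldr (++) []

For example, concats [[1,2,3],[4,5,6],[7,8,9]] evaluates to [1,2,3,4,5,6,7,8,9].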
Exercise A.4

- What's the performance of the concats algorithm? Is it linear or quadratic?

- Design another linear time concats algorithm without using folding.

- Realize the mapping algorithm by using folding.

A.7 Searching and matching

Searching and matching are very important algorithms. They are not limited to linked lists, but are applicable to a wide range of data structures. We just scratch the surface of searching and matching in this appendix; there are dedicated chapters in this book explaining them.

A.7.1 Existence testing

The simplest searching case is to test whether a given element exists in a list. A linear time traversal can solve this problem. In order to determine whether element x exists in list L:

- If the list is empty, it's obvious that the element doesn't exist in L;

- If the first element in the list equals x, we know that x exists;

- Otherwise, we need to recursively test whether x exists in the rest of the list L'.

This simple description can be directly formalized into the following equation.

x ∈ L =
  False : L = ∅
  True : l1 = x
  x ∈ L' : otherwise    (A.71)

This is definitely a linear algorithm, bound to O(N) time. The best case happens in the two trivial clauses: either the list is empty, or the first element is what we are finding. The worst case happens when the element doesn't exist at all, or is the last element; in both cases, we need to traverse the whole list. If the probability is equal for all positions, the average case takes about (N+1)/2 steps of traversal. This algorithm is so trivial that we leave the implementation as an exercise to the reader.

If the list is ordered, one may expect to improve the algorithm from linear to logarithmic time. However, as we discussed, since a list doesn't support constant time random access, binary search can't be applied here. There is a dedicated chapter in this book discussing how to evolve the linked list into a binary tree to achieve quick searching.
A.7.2 Looking up

One extra step beyond existence testing is to find the interesting information stored in the list. There are two typical methods to augment extra data onto the elements. Since the linked list is a chain of nodes, we can store satellite data in the node, then provide key(n) to access the key of the node, rest(n) for the rest of the list, and value(n) for the augmented data. The other method is to pair the key and data, for example {(1, hello), (2, world), (3, foo), ...}. We'll introduce how to form such a pairing list in a later section.

The algorithm is almost the same as the existence testing: it traverses the list, examining the keys one by one. Whenever it finds a node with the same key as the one we are looking up, it stops, and returns the augmented data. It is obvious that this is a linear strategy. If the satellite data is augmented to the node directly, the algorithm can be defined as follows.

lookup(x, L) =
  ∅ : L = ∅
  value(l1) : key(l1) = x
  lookup(x, L') : otherwise    (A.72)

In this algorithm, L is a list of nodes which are augmented with satellite data. Note that the first case actually means look-up failure, so the result is empty. Some functional programming languages, such as Haskell, provide the Maybe type to handle the possibility of failure. This algorithm can be slightly modified to handle a key-value pair list as well.

lookup(x, L) =
  ∅ : L = ∅
  snd(l1) : fst(l1) = x
  lookup(x, L') : otherwise    (A.73)

Here L is a list of pairs; the functions fst(p) and snd(p) access the first and second part of the pair respectively. Both algorithms are in tail-call manner; they can be transformed into imperative loops easily. We leave this as an exercise to the reader.
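Equation (A.73), using the Maybe type mentioned above, is essentially the standard Prelude lookup; a sketch with a primed name:

lookup' :: Eq k => k -> [(k, v)] -> Maybe v
lookup' _ [] = Nothing
lookup' x ((k, v):ps) = if x == k then Just v else lookup' x ps

For example, lookup' 2 [(1, "hello"), (2, "world")] evaluates to Just "world".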
A.7.3 filtering

Let's take one more step ahead. The look-up algorithm performs a linear search by comparing the key of each element with the given value. A more general case is to find an element matching a certain predicate. We can abstract this matching condition as a parameter of a generic linear finding algorithm.

find(p, L) =
  ∅ : L = ∅
  l1 : p(l1)
  find(p, L') : otherwise    (A.74)

The algorithm traverses the list, examining whether each element satisfies the predicate p. It fails if the list becomes empty while nothing has been found; this is handled in the first trivial edge case. If the first element in the list satisfies the condition, the algorithm returns the whole element (node), and the user can further handle it as they like (extract the satellite data, or whatever else); otherwise, the algorithm recursively performs finding on the rest of the list. Below is the corresponding Haskell example program.

find _ [] = Nothing
find p (x:xs) = if p x then Just x else find p xs

Translating this to an imperative algorithm is straightforward. Here we use NIL to represent the failure case.

function Find(p, L)
  while L ≠ ∅ do
    if p(First(L)) then
      return First(L)
    L ← Rest(L)
  return NIL
And here is the Python example of finding.

def find(p, xs):
    while xs is not None:
        if p(xs.key):
            return xs
        xs = xs.next
    return None

It is quite possible that multiple elements in the list satisfy the precondition. The finding algorithm designed so far just picks the first one it meets, and stops immediately. It can be considered a special case of finding all elements under a certain condition.

Another viewpoint of finding all elements with a given predicate is to treat the finding algorithm as a black box: the input to this box is a list, while the output is another list containing all the elements satisfying the predicate. This can be called filtering, as shown in Figure A.4.

Figure A.4: The input is the original list {x1, x2, ..., xN}; the output is a list {x1', x2', ..., xM'}, such that for every xi', the predicate p(xi') is satisfied.

This figure can be formalized in another form, in the taste of set enumeration; note, however, that we actually enumerate over a list instead of a set.

filter(p, L) = {x | x ∈ L ∧ p(x)}    (A.75)

Some environments, such as Haskell (and Python, for any iterable), support this form as list comprehension.

filter p xs = [ x | x <- xs, p x]

And in Python, for the built-in list:

def filter(p, xs):
    return [x for x in xs if p(x)]

Note that the Python built-in list isn't the singly linked list we discuss in this appendix.
In order to modify the finding algorithm to realize filtering, the found elements are appended to a result list, and instead of stopping the traversal, all the rest of the elements are examined with the predicate.

filter(p, L) =
  ∅ : L = ∅
  cons(l1, filter(p, L')) : p(l1)
  filter(p, L') : otherwise    (A.76)

This algorithm returns an empty result if the list is empty, for the trivial edge case. For a non-empty list, suppose the recursive result of filtering the rest of the list is A; the algorithm examines the first element l1, and if it satisfies the predicate, it is put in front of A by a cons operation (O(1) time). The corresponding Haskell program is given below.

filter _ [] = []
filter p (x:xs) = if p x then x : filter p xs else filter p xs

Although we said the next found element is 'appended' to the result list, this algorithm actually constructs the result list from the right-most element to the left, so appending is avoided, which ensures the linear O(N) performance. Comparing this algorithm with the following imperative quadratic realization reveals the difference.

function Filter(p, L)
  L' ← ∅
  while L ≠ ∅ do
    if p(First(L)) then
      L' ← Append(L', First(L))    ▷ Linear operation
    L ← Rest(L)

As the comment on the appending statement indicates, appending typically takes time proportional to the length of the result list if the tail position isn't memorized. This fact indicates that directly transforming the recursive filter algorithm into tail-call form will downgrade the performance from O(N) to O(N²). As shown in the equation below, filter(p, L) = filter'(p, L, ∅) performs as poorly as the imperative one.

filter'(p, L, A) =
  A : L = ∅
  filter'(p, L', A ∪ {l1}) : p(l1)
  filter'(p, L', A) : otherwise    (A.77)

One solution to achieve linear time performance imperatively is to construct the result list in reverse order, and perform an O(N) reversal afterwards (refer to the previous section) to get the final result.
This is left as an exercise to the reader.

The fact of constructing the result list from right to left indicates the possibility of realizing filtering with the folding-right concept. We need to design some combinator function f, such that filter(p, L) = foldr(f, ∅, L). The function f takes two arguments: one is the element iterated over the list; the other is the intermediate result constructed from the right. f(x, A) can be defined to test the predicate against x: if it succeeds, the result is updated to cons(x, A); otherwise A is kept the same.

f(x, A) =
  cons(x, A) : p(x)
  A : otherwise    (A.78)

However, the predicate must be passed to function f as well. This can be achieved by currying, so f actually has the prototype f(p, x, A), and filter can be defined as follows.

filter(p, L) = foldr(λ(x, A) · f(p, x, A), ∅, L)    (A.79)

This can be simplified by η-conversion. For the detailed definition of η-conversion, readers can refer to [2].

filter(p, L) = foldr(f(p), ∅, L)    (A.80)

The following Haskell example program implements this equation.

filter p = foldr f [] where
  f x xs = if p x then x : xs else xs
Similar to mapping and folding, filtering is actually a generic concept: we can apply a predicate on any traversable data structure to get what we are interested in. Readers can refer to the topic about monoids in [8] for further reading.

A.7.4 Matching

Matching generally means finding a given pattern among some data structure. In this section, we limit the topic to lists, and even this limitation leads to a very wide and deep topic. There are dedicated chapters in this book introducing matching algorithms, so here we only select the algorithm that tests whether a given list exists in another (typically longer) list. Before diving into the algorithm of finding a sub-list at any position, two special edge cases are used as warm-up: the algorithms to test whether a given list is a prefix or a suffix of another.

In the section about span, we have seen how to find a prefix satisfying a certain condition. Prefix matching can be considered a special case in some sense: it compares the elements of the two lists pairwise from the beginning, until it meets any differing elements or passes the end of one list. Define P ⊑ L to mean P is a prefix of L.

P ⊑ L =
  True : P = ∅
  False : p1 ≠ l1
  P' ⊑ L' : otherwise    (A.81)

This is obviously a linear algorithm. However, we can't use the very same approach to test whether a list is a suffix of another, because it isn't cheap to start from the end of a linked list and iterate backwards. (Arrays, on the other hand, which support random access, can easily be traversed backwards.) As we only need a yes-no result, one solution for a linear suffix testing algorithm is to reverse both lists (which takes linear time), and use prefix testing instead. Define L ⊒ P to mean P is a suffix of L.
L ⊒ P = reverse(P) ⊑ reverse(L)    (A.82)
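Equations (A.81) and (A.82) transliterate directly; Data.List already exports isPrefixOf and isSuffixOf, so the primed names below are ours:

isPrefixOf' :: Eq a => [a] -> [a] -> Bool
isPrefixOf' [] _ = True
isPrefixOf' _ [] = False
isPrefixOf' (p:ps) (x:xs) = p == x && isPrefixOf' ps xs

isSuffixOf' :: Eq a => [a] -> [a] -> Bool
isSuffixOf' p l = isPrefixOf' (reverse p) (reverse l)

For example, "ss" `isPrefixOf'` "ssippi" and "ppi" `isSuffixOf'` "Mississippi" both evaluate to True.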
With ⊑ and ⊒ defined, we can test whether a list is an infix of another. The idea is to traverse the target list, repeatedly applying the prefix testing until it succeeds or we arrive at the end.

function Is-Infix(P, L)
  while L ≠ ∅ do
    if P ⊑ L then
      return TRUE
    L ← Rest(L)
  return FALSE
Formalizing this algorithm into a recursive equation leads to the definition below.

infix?(P, L) =
  True : P ⊑ L
  False : L = ∅
  infix?(P, L') : otherwise    (A.83)

Note that there is a tricky implicit constraint in this equation. If the pattern P is empty, it is an infix of any target list. This case is actually covered by the first condition in the above equation, because the empty list is also a prefix of any list. In most programming languages supporting pattern matching, we can't arrange the second clause as the first edge case, or it would return false for infix?(∅, ∅). (One exception is Prolog, but this is a language-specific feature, which we won't cover in this book.)

Since prefix testing is linear, and it is called while traversing the list, this algorithm is quadratic, O(N·M), where N and M are the lengths of the pattern and target lists respectively. There is no trivial way to improve this 'position by position' scanning algorithm to linear, even if the data structure changes from a linked list to a randomly accessible array. There are chapters in this book introducing several approaches for fast matching, including the suffix tree with Ukkonen's algorithm, the Knuth-Morris-Pratt algorithm, and the Boyer-Moore algorithm.

Alternatively, we can enumerate all suffixes of the target list, and check whether the pattern is a prefix of any of these suffixes. This can be represented as the following.

infix?(P, L) = ∃S ∈ suffixes(L) ∧ P ⊑ S    (A.84)

This can be expressed as a list comprehension, for example in the Haskell program below.

isInfixOf x y = (not . null) [ s | s <- tails y, x `isPrefixOf` s]

Where the function isPrefixOf is the prefix testing defined according to our previous design, and the function tails generates all suffixes of a list. The implementation of tails is left as an exercise to the reader.

Exercise A.5

- Implement the linear existence testing in both functional and imperative approaches in your favorite programming languages.

- Implement the looking-up algorithm in your favorite imperative programming language.
- Realize the linear time filtering algorithm imperatively by first building the result list in reverse order, and finally reversing it to resume the normal result. Implement this algorithm with both an imperative loop and a functional tail-recursive call.

- Implement the imperative algorithm of prefix testing in your favorite programming language.

- Implement the algorithm to enumerate all suffixes of a list.
A.8 zipping and unzipping

It is quite common to construct a list of paired elements. For example, in the naive brute-force solution for the 'drunk jailer' puzzle shown in the section about mapping, we need to represent the states of all the lights, initialized as {(1, 0), (2, 0), ..., (N, 0)}. Another example is to build a key-value list, such as {(1, a), (2, an), (3, another), ...}. In the 'drunk jailer' example, the list of pairs is built like the following.

map(λi · (i, 0), {1, 2, ..., N})

The more general case is that two lists have already been prepared, and what we need is a handy 'zipper' method.

zip(A, B) =
  ∅ : A = ∅ ∨ B = ∅
  cons((a1, b1), zip(A', B')) : otherwise    (A.85)

Note that this algorithm is capable of handling the case where the two lists being zipped have different lengths: the result list of pairs aligns with the shorter one.
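Equation (A.85) transliterates to Haskell as below (zip is standard; zip' is our name for the sketch):

zip' :: [a] -> [b] -> [(a, b)]
zip' [] _ = []
zip' _ [] = []
zip' (x:xs) (y:ys) = (x, y) : zip' xs ys

For example, zip' [0,0,0] [1,2,3] evaluates to [(0,1),(0,2),(0,3)].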
It's even possible to zip an infinite list with another one of limited length, in an environment supporting lazy evaluation. For example, with this auxiliary function defined, we can initialize the lights state as

zip({0, 0, ...}, {1, 2, ..., N})

In languages supporting list enumeration, such as Haskell (Python provides a similar range function, but it manipulates the built-in list, which isn't a linked list), this can be expressed as zip (repeat 0) [1..n]. Given a list of words, we can also index them with consecutive numbers as

zip({1, 2, ...}, {a, an, another, ...})

Note that the zipping algorithm is linear, as it uses the constant time cons operation in each recursive call. However, directly translating zip into imperative manner would downgrade the performance to quadratic, unless the linked list is optimized with a tail position cache, or we modify one of the passed-in lists in place.

function Zip(A, B)
  C ← ∅
  while A ≠ ∅ ∧ B ≠ ∅ do
    C ← Append(C, (First(A), First(B)))
    A ← Rest(A)
    B ← Rest(B)
  return C

Note that the appending operation is proportional to the length of the result list C, so it gets slower and slower as the traversal proceeds. There are three solutions to improve this algorithm to linear time.
The first method is to use an approach similar to the one used in infix-testing: construct the result list of pairs in reverse order, then perform a linear reverse operation before returning the final result. The second method is to modify one passed-in list, for example A, in place while traversing, translating it from a list of elements to a list of pairs. The third method is to remember the last appending position. Please try these solutions as an exercise.

The key point of linear time zipping is that the result list is actually built from right to left, which is similar to the infix-testing algorithm. So it's quite possible to provide a folding-right realization. This is left as an exercise to the reader.

It is natural to extend the zipper algorithm so that multiple lists can be zipped into one list of multiple elements. For example, the Haskell standard library provides zip, zip3, zip4, ..., up to zip7. Another typical extension to the zipper is that sometimes we don't want a list of pairs (or tuples, more generally); instead, we want to apply some combinator function to each pair of elements. For example, consider the case where we have a list of unit prices for every fruit: apple, orange, banana, ..., as {1.00, 0.80, 10.05, ...}, all in Dollars; and the customer's cart holds a list of purchased quantities, for instance {3, 1, 0, ...}, meaning the customer put 3 apples and an orange in the cart, and didn't take any bananas, so the quantity of banana is zero. We want to generate a list of costs, containing how much to pay for apple, orange, banana, ... respectively. The program can be written from scratch as below.

paylist(U, Q) =
  ∅ : U = ∅ ∨ Q = ∅
  cons(u1 × q1, paylist(U', Q')) : otherwise
Comparing this equation with the zipper algorithm, it is easy to find the common structure of the two; we can parameterize the combinator function as f, so that the 'generic' zipper algorithm can be defined as the following.

zipWith(f, A, B) =
  ∅ : A = ∅ ∨ B = ∅
  cons(f(a1, b1), zipWith(f, A', B')) : otherwise    (A.86)

Here is an example that defines the inner product (or dot product) [14] by using zipWith.

A · B = sum(zipWith(×, A, B))    (A.87)

It is necessary to realize the inverse operation of zipping, which converts a list of pairs into separate lists of elements. Back to the purchasing example: it is quite possible that the unit price information is stored in an association list like U = {(apple, 1.00), (orange, 0.80), (banana, 10.05), ...}, so that it's convenient to look up the price by product name, for instance lookup(melon, U). Similarly, the cart can also be represented clearly in this manner, for example Q = {(apple, 3), (orange, 1), (banana, 0), ...}. Given such a 'product - unit price' list and a 'product - quantity' list, how do we calculate the total payment? One straightforward idea, derived from the previous solution, is to extract the unit price list and the purchased quantity list, then calculate their inner product.

pay = sum(zipWith(×, snd(unzip(U)), snd(unzip(Q))))    (A.88)
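Equations (A.86) and (A.87) in Haskell; zipWith and sum are standard, and innerProduct is our name for (A.87):

zipWith' :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith' _ [] _ = []
zipWith' _ _ [] = []
zipWith' f (x:xs) (y:ys) = f x y : zipWith' f xs ys

innerProduct :: Num a => [a] -> [a] -> a
innerProduct xs ys = sum (zipWith' (*) xs ys)

For the purchase example, innerProduct [1.00, 0.80, 10.05] [3, 1, 0] evaluates to 3.8 (Dollars), which is exactly the payment of equation (A.88) once the prices and quantities are extracted.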
Although the definition of unzip can be directly written as the inverse of zip, here we give a realization based on folding-right.

unzip(L) = foldr(λ((a, b), (A, B)) · (cons(a, A), cons(b, B)), (∅, ∅), L)    (A.89)

The initial result is a pair of empty lists. During the folding process, the head of the list, which is a pair of elements, as well as the intermediate result, are passed to the combinator function. This combinator function is given as a lambda expression: it extracts the paired elements, and puts them in front of the two intermediate lists respectively. Note that we use implicit pattern matching to extract the elements from the pairs. Alternatively this can be done by using the fst and snd functions explicitly, as

λ(p, P) · (cons(fst(p), fst(P)), cons(snd(p), snd(P)))

The following Haskell example code implements the unzip algorithm.

unzip = foldr (\(a, b) (as, bs) -> (a:as, b:bs)) ([], [])

The zip and unzip concepts can be extended more generally, rather than being limited to linked lists. It is quite useful to zip two lists into a tree, where the data stored in the tree are paired elements from both lists. General zip and unzip can also be used to track the traversal path of a collection, to mimic the 'parent' pointer of imperative implementations. Please refer to the last chapter of [8] for a good treatment.

Exercise A.6

- Design and implement the iota (ι) algorithm, which can enumerate a list given some parameters. For example:

  - iota(..., N) = {1, 2, 3, ..., N};
  - iota(M, N) = {M, M + 1, M + 2, ..., N}, where M ≤ N;
  - iota(M, M + a, ..., N) = {M, M + a, M + 2a, ..., N};
  - iota(M, M, ...) = repeat(M) = {M, M, M, ...};
  - iota(M, ...) = {M, M + 1, M + 2, ...}.

  Note that the last two cases essentially demand generating an infinite list. Consider how to represent an infinite list; you may refer to streaming and lazy evaluation materials such as [5] and [8].

- Design and implement a linear time imperative zipper algorithm.

- Realize the zipper algorithm with the folding-right approach.

- For the purchase payment example, suppose the quantity association list only contains items whose quantity isn't zero; that is, instead of a list Q = {(apple, 3), (banana, 0), (orange, 1), ...}, it holds a list like Q = {(apple, 3), (orange, 1), ...}. The 'banana' entry is filtered out because the customer doesn't pick any bananas. Write a program taking the unit-price association list and this kind of quantity list, to calculate the total payment.
A.9 Notes and short summary

In this appendix, we gave a quick introduction to building, manipulating, transforming, and searching singly linked lists, in both purely functional and imperative approaches. Most modern programming environments are equipped with tools for handling such elementary data structures; however, those tools are designed for general-purpose cases, and serious programming shouldn't take them as black boxes. The linked list is so critical that it forms the cornerstone of almost all functional programming environments, just as the array does in imperative settings. We take this topic as an appendix to the book; it is quite OK for the reader to start with the first chapter about the binary search tree, a kind of 'hello world' topic, and refer to this appendix upon meeting any unfamiliar list operations.
Bibliography

[1] Richard Bird. Pearls of Functional Algorithm Design. Cambridge University Press; 1st edition (November 1, 2010). ISBN: 978-0521513388

[2] Simon L. Peyton Jones. The Implementation of Functional Programming Languages. Prentice-Hall International Series in Computer Science. Prentice Hall (May 1987). ISBN: 978-0134533339

[3] Andrei Alexandrescu. Modern C++ Design: Generic Programming and Design Patterns Applied. Addison Wesley (February 1, 2001). ISBN: 0-201-70431-5

[4] Benjamin C. Pierce. Types and Programming Languages. The MIT Press, 2002. ISBN: 0262162091

[5] Harold Abelson, Gerald Jay Sussman, Julie Sussman. Structure and Interpretation of Computer Programs, 2nd Edition. MIT Press, 1996. ISBN: 0-262-51087-1

[6] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press (July 1, 1999). ISBN-13: 978-0521663502

[7] Fethi Rabhi, Guy Lapalme. Algorithms: A Functional Programming Approach. Second edition. Addison-Wesley, 1999. ISBN: 0201-59604-0

[8] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's Guide. No Starch Press; 1st edition (April 2011), 400 pp. ISBN: 978-1-59327-283-8

[9] Joe Armstrong. Programming Erlang: Software for a Concurrent World. Pragmatic Bookshelf; 1st edition (July 18, 2007). ISBN-13: 978-1934356005

[10] Wikipedia. Tail call. https://en.wikipedia.org/wiki/Tail_call

[11] SGI. transform. http://www.sgi.com/tech/stl/transform.html

[12] ACM/ICPC. The drunk jailer. Peking University judge online for ACM/ICPC. http://poj.org/problem?id=1218

[13] Haskell wiki. Haskell programming tips. 4.4 Choose the appropriate fold. http://www.haskell.org/haskellwiki/Haskell_programming_tips

[14] Wikipedia. Dot product. http://en.wikipedia.org/wiki/Dot_product
GNU Free Documentation License

Version 1.3, 3 November 2008

Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. http://fsf.org/

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.

The "publisher" means any person or entity that distributes copies of the Document to the public.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons