Pairing Heaps with Costless Meld
Abstract
Improving the structure and analysis in [1], we give a variation of the pairing heaps that has amortized zero cost per meld (compared to an $O(\log\log n)$ cost in [1]) and the same amortized bounds for all other operations. More precisely, the new pairing heap requires: no cost per meld, $O(1)$ per find-min and insert, $O(\log n)$ per delete-min, and $O(\log\log n)$ per decrease-key. These bounds are the best known for any self-adjusting heap, and match the lower bound proven by Fredman for a family of such heaps. Moreover, our structure is even simpler than that in [1].
1 Introduction
The pairing heap [5] is a self-adjusting heap that is implemented as a single heap-ordered multi-way tree. The basic operation on a pairing heap is the linking operation in which two trees are combined by linking the root with the larger key value to the other as its leftmost child. The following operations are defined for the standard implementation of the pairing heaps:
- find-min. Return the value at the root of the heap.
- insert. Create a single-node tree and link it with the tree of the heap.
- decrease-key. Decrease the value of the corresponding node. If this node is not the root, cut its subtree and link the two resulting trees.
- meld. Link the two trees representing the two heaps.
- delete-min. Remove the root of the heap and return its value. The resulting trees are then combined to form a single tree. For the standard two-pass variant, the linkings are performed in two passes. In the first pass, called the pairing pass, the trees are linked in pairs from left to right (pairing these trees from right to left achieves the same amortized bounds). In the second pass, called the right-to-left incremental-linking pass, the resulting trees are linked in order from right to left, where each tree is linked with the tree resulting from the linkings of the trees to its right. Other variants with different delete-min implementation were given in [2, 3, 5].
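The standard operations listed above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of [5]: the names `Node`, `link`, `two_pass`, and `PairingHeap` are hypothetical, children are kept in a `deque` so that adding a leftmost child takes constant time, and decrease-key is omitted because it would additionally need parent and sibling pointers.

```python
from collections import deque


class Node:
    def __init__(self, key):
        self.key = key
        self.children = deque()  # children[0] is the leftmost child


def link(a, b):
    """Link two trees: the root with the larger key becomes the
    leftmost child of the other root. Either argument may be None."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.appendleft(b)
    return a


def two_pass(trees):
    """Combine a list of trees with the standard two-pass scheme."""
    trees = list(trees)
    # Pairing pass: link the trees in pairs from left to right.
    paired = [
        link(trees[i], trees[i + 1] if i + 1 < len(trees) else None)
        for i in range(0, len(trees), 2)
    ]
    # Right-to-left incremental-linking pass.
    result = None
    for tree in reversed(paired):
        result = link(tree, result)
    return result


class PairingHeap:
    def __init__(self):
        self.root = None

    def find_min(self):
        return self.root.key

    def insert(self, key):
        node = Node(key)
        self.root = link(self.root, node)
        return node

    def meld(self, other):
        self.root = link(self.root, other.root)
        other.root = None  # the other heap is consumed

    def delete_min(self):
        old_root = self.root
        self.root = two_pass(old_root.children)
        return old_root.key
```

Inserting a few keys and then calling `delete_min` repeatedly returns them in nondecreasing order, which is a quick sanity check of the linking logic.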
The original analysis of the pairing heaps [5] showed an $O(\log n)$ amortized cost for all operations. Another self-adjusting heap that requires $O(\log n)$ amortized cost per operation [11] is the skew heap. Further theoretical results on the pairing heaps followed over the years. Stasko and Vitter [12] suggested a variant that achieves $O(1)$ amortized cost per insert. The bounds for the standard implementation were later improved by Iacono [7] to: $O(1)$ per insert, and zero cost per meld. Fredman [4] showed that $\Omega(\log\log n)$ amortized comparisons, in the decision-tree model, would be necessary per decrease-key operation for a family of heaps that generalizes the pairing heaps. Pettie [10] proved amortized costs of: $O(\log n)$ per delete-min, and $O(2^{2\sqrt{\log\log n}})$ for other operations. Recently, Elmasry [1] introduced a variant that achieves the following amortized bounds: $O(1)$ per insert, $O(\log n)$ per delete-min, and $O(\log\log n)$ per decrease-key and meld. See Table 1.
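| | insert | delete-min | decrease-key | meld |
|---|---|---|---|---|
| Fredman et al. [5] | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| Stasko and Vitter [12] | $O(1)$ | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| Iacono [7] | $O(1)$ | $O(\log n)$ | $O(\log n)$ | zero |
| Pettie [10] | $O(2^{2\sqrt{\log\log n}})$ | $O(\log n)$ | $O(2^{2\sqrt{\log\log n}})$ | $O(2^{2\sqrt{\log\log n}})$ |
| Elmasry [1] | $O(1)$ | $O(\log n)$ | $O(\log\log n)$ | $O(\log\log n)$ |
| This paper | $O(1)$ | $O(\log n)$ | $O(\log\log n)$ | zero |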
Several experiments were conducted on the pairing heaps, comparing their performance either with that of other priority queues [8, 9] or with that of some of their variants [2, 3, 12]. Such experiments illustrate that the pairing heaps are efficient in practice and superior to other heaps, including the Fibonacci heaps [6].
In this paper, we give a variation of the pairing heaps that achieves the best known bounds for any self-adjusting heap for all operations. Namely, our amortized bounds are: zero cost per meld, $O(1)$ per find-min and insert, $O(\log n)$ per delete-min, and $O(\log\log n)$ per decrease-key. We describe the data structure in Section 2, prove the time bounds in Section 3, give possible variations in Section 4, and conclude the paper with some remarks.
2 The data structure
Similar to the standard implementation of the pairing heaps, we implement our variation as a single heap-ordered multi-way tree.
Since we perform the decrease-key operations lazily, a pointer to the minimum element is maintained.
The detailed implementations for various heap operations are as follows:
- find-min. Return the value of the node pointed to by the minimum pointer.
- insert. Create a single-node tree and link it with the main tree. Update the minimum pointer to point to this node if it is the new minimum.
- decrease-key. Decrease the value of the corresponding node $x$. Update the minimum pointer to point to $x$ if it is the new minimum. Add $x$ to the list of decreased nodes if it is not a root.
We use the following procedure in implementing the upcoming operations:
- clean-up:
  i. Perform the following for every node $x$ in the list of decreased nodes: Cut $x$'s subtree and the subtree of the leftmost child of $x$. Glue the subtree of the leftmost child of $x$ in place of $x$'s subtree, and add the rest of $x$'s subtree (excluding the subtree of $x$'s leftmost child that has just been cut) to the pool of trees to be combined. See Figure 1.
  Figure 1: A cut performed by the clean-up procedure.
  ii. Arbitrarily divide the trees of the pool into groups of $\lceil \log n \rceil$ trees each (except possibly for one smaller group). For every group, sort the values of the roots of the trees and link the resulting trees in this order such that their roots form a path of nodes in the combined tree (make every root the leftmost child of the root with the next smaller value). Link the combined trees with the main tree in any order.
- meld. Call clean-up for the smaller heap. Link the trees of the two heaps. Destroy the smaller heap. Update the minimum pointer to point to the root if it has the minimum of the melded heap.
- delete-min. Call clean-up. Apply the standard two-pass implementation of the pairing heaps [5]. Make the minimum pointer point to the root of the resulting tree.
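The two steps of the clean-up procedure can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: `Node`, `cut_decreased`, `combine_group`, and `combine_pool` are hypothetical names, children are stored in a plain Python list (so locating a node among its siblings is a linear scan, which a pointer-based implementation would avoid), and the group size is passed in as a parameter.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.children = []  # children[0] is the leftmost child


def cut_decreased(x):
    """Step i for one decreased non-root node x: the subtree of x's
    leftmost child takes x's place, and x (with the rest of its
    subtree) is returned so it can be added to the pool."""
    p = x.parent
    idx = p.children.index(x)  # linear scan, acceptable for a sketch
    if x.children:
        c = x.children.pop(0)  # leftmost child of x
        c.parent = p
        p.children[idx] = c  # glue c's subtree in place of x's subtree
    else:
        p.children.pop(idx)  # x has no child to leave behind
    x.parent = None
    return x


def combine_group(roots):
    """Step ii for one group: sort the roots by key and chain them into
    a path, each root becoming the leftmost child of the root with the
    next smaller key."""
    ordered = sorted(roots, key=lambda r: r.key)
    for smaller, larger in zip(ordered, ordered[1:]):
        smaller.children.insert(0, larger)
        larger.parent = smaller
    return ordered[0]


def combine_pool(pool, group_size):
    """Split the pool into consecutive groups of at most group_size
    trees and combine each group into a single tree."""
    return [
        combine_group(pool[i:i + group_size])
        for i in range(0, len(pool), group_size)
    ]
```

Each tree returned by `combine_pool` would then be linked with the main tree in any order. Because every pooled root had its key decreased, the child glued in its place still respects heap order with respect to the former parent.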
3 Analysis
We prove the following theorem that implies the claimed time bounds:
Theorem 1
Starting with an empty heap, consider a sequence of operations . Let is a meld operation, is a find-min or an insert operation, is a decrease-key operation, and is a delete-min operation. The sequence can be executed on our pairing heaps in , where is the number of elements that are in the heap at operation and will leave the heap while performing .
For the sake of the analysis, we categorize the nodes as follows. A node is black if it will remain in the heap after performing the sequence of operations under consideration, otherwise it is white. A black node whose descendants are all black is called an inactive node. Let be the number of white descendants of a node , including if it is white.
1. Inactive nodes: every node that has no white descendants, itself included.
2. Active nodes: all other nodes.
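The classification above can be made concrete with a small sketch. The colors are an analysis device only (whether a node is white depends on future operations), so the `white` flag below is supplied by hand; `Node`, `white_count`, and `is_active` are hypothetical names used for illustration.

```python
class Node:
    def __init__(self, white, children=()):
        self.white = white  # True if the node will leave the heap
        self.children = list(children)


def white_count(x):
    """Number of white nodes in x's subtree, counting x itself."""
    return int(x.white) + sum(white_count(c) for c in x.children)


def is_active(x):
    """A node is active unless its whole subtree, itself included,
    is black."""
    return white_count(x) > 0
```

For example, a black root whose only white descendant is a grandchild is active, as is every node on the path down to that grandchild, while a black leaf elsewhere in the tree is inactive.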
To bound the cost of the heap operations, we use a combination of the potential function and the accounting methods [13].
3.1 The potential function
Consider the link between a node and its parent . Let be the number of white descendants of restricted to the subtrees of the right siblings of , including if it is white. We use the potential function
Despite the fact that the potential on a link may reach , the sum of potentials on a path from a node to any of its descendants telescopes to at most . If the path is the left spine of the subtree of , the sum of potentials telescopes to exactly .
3.2 Debits
Consider the following two cases:
- a white node is inserted in a heap with an active root.
- two heaps with active roots are melded.
To fulfill the potential requirements, units are borrowed from the allowable cost for the delete-min operations that will be performed on the white nodes. The following lemma illustrates that these debits are enough to cover the above two cases.
Lemma 1
Consider the heap at any time during the execution of the sequence of operations . Let is a delete-min operation that will be performed on a node currently in the heap. The sum of the potentials on the links formed by insert or meld operations is at most , where is the number of elements that are in the heap at operation and will leave the heap while performing .
Proof. Let be a tree representing a heap that has white nodes at this point of time. Let be the set restricted to the operations performed on the nodes of , and be the sum of the potentials on the links of formed by insert or meld operations. We prove by induction the stronger fact that . Since all the white nodes will eventually be deleted, then . Consider an insert operation, where a white node is linked to resulting in the tree . The required potential on this link is . By induction, . Consider a meld operation, where two trees and with active roots are linked resulting in tree . Assume that and have white nodes, respectively. The required potential on this link is at most . By induction, . This follows from the fact that , for any integers .
3.3 Credits
We maintain the following credits in addition to the potential function:
- Decrease credits: credits for every decreased node since the previous clean-up is performed.
- Heap credits: credits per heap, where is the size of this heap.
- Active-parent credits: credits for every child of an active node.
- Active-run credits: credits for every active node with an inactive right sibling.
3.4 The time bounds
Next, we analyze the time bounds for our operations. Each operation must maintain the potential function, the credits, and pay for the work it performs.
3.4.1 find-min
No potential or credit changes are required. The actual work of find-min is $O(1)$. It follows that the worst-case cost of find-min is $O(1)$.
3.4.2 insert
If the inserted node is white, extra potential units may be needed. But, as Lemma 1 illustrates, these units are borrowed from the logarithmic cost per delete-min, and the insert operation need not pay for that.
Assume that as a result of the insert operation node $x$ is linked to node $y$. If $y$ is active, the active-parent credits need to be increased by $O(1)$. If $x$ is active, and the previous leftmost child of $y$ was inactive, the active-run credits need to be increased by $O(1)$. Since the size of the heap increased by one, the heap credits need to be increased by $O(1)$. The decrease credits need to be increased by $O(\log\log(n+1) - \log\log n)$ per decreased node, which still sums up to $O(1)$ as indicated by the following proposition.
Proposition 1
.
Proof. For ,
But , where is the base of the natural logarithm.
The actual work to link an inserted node with the main tree is $O(1)$. It follows that the amortized cost of insert is $O(1)$.
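The claim that the extra decrease credits sum to a constant can be checked numerically. The sketch below rests on two assumptions that are not taken from the text: each decreased node holds exactly $\log_2\log_2 n$ credits (constant factor 1), and there are at most $n$ decreased nodes when the heap grows from $n$ to $n+1$ elements. The function name `total_credit_increase` is hypothetical.

```python
import math


def loglog(m):
    """log2(log2(m)), defined for m >= 2."""
    return math.log2(math.log2(m))


def total_credit_increase(n):
    """Upper bound on the extra decrease credits needed when the heap
    size grows from n to n + 1, assuming at most n decreased nodes
    that each hold loglog(size) credits."""
    return n * (loglog(n + 1) - loglog(n))


# Largest value of the bound over a range of heap sizes.
worst = max(total_credit_increase(n) for n in range(2, 100_000))
```

Under these assumptions the bound stays below a small constant for every tested $n$ and shrinks as $n$ grows, which is consistent with the inequality $(1 + 1/n)^n < e$ used in the proof.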
3.4.3 decrease-key
No potential changes are required. The decrease-key pays $O(\log\log n)$ credits for the decreased node. The actual work it performs is $O(1)$. It follows that the amortized cost of the decrease-key operation is $O(\log\log n)$.
3.4.4 clean-up
First, consider the effect of a cut performed on a decreased node $x$:
Consider the path of nodes from the root including all the ancestors of $x$ followed by the nodes on the left spine of $x$'s subtree. Since we cut the subtree of $x$ and replace it with the subtree of its leftmost child, the nodes of the above path remain the same except for $x$. If all the descendants of $x$ are black, possibly excluding the subtree of its leftmost child, then the potentials on all the links do not change as a result of the cut. Otherwise, all the ancestors of $x$ before the cut are active. In such a case, the proof given in [1] can be applied, indicating that the sum of the potentials on all the links does not increase.
If $x$ and both its left and right siblings are active while its leftmost child is inactive, then the number of active-runs increases by one, and the extra credits needed are paid for from the released decrease credits.
Second, consider the effect of combining the trees and linking them with the main tree:
The trees of a group are combined by sorting the values in their roots and linking them accordingly in order. Since the size of a group is at most $\lceil \log n \rceil$, the actual work done in sorting is paid for from the released decrease credits ($O(\log\log n)$ credits per node). This will result in a new path of links. Since the sum of the potential values on a path telescopes, the increase in potential as a result of combining the trees of a group and then linking this group to the main tree is $O(\log n)$. This potential increase is also paid for from the decrease credits, except for possibly the last group. (The last group may be a smaller group, and its decrease credits may not be enough to pay for the increase in potential.)
As a result of a link, the number of active-runs and the number of active-parents may each increase by one; the credits this requires are again paid for from the decrease credits.
It follows that the overall amortized cost of the clean-up procedure is $O(\log n)$.
3.4.5 meld
As for insert, extra potential units may be needed. But, as Lemma 1 illustrates, these units are borrowed from the logarithmic cost per delete-min.
The cost of the clean-up performed on the smaller heap is , where is its size. Since the size of the combined heap is at most twice the size of the larger heap, the heap credits for the combined heap need to be incremented by . Similar to insert, the active-parent credits and the active-run credits may need to be increased by . The actual work for meld, other than the clean-up of the smaller heap, is . All these costs are paid for from the heap credits of the smaller heap, before it is destroyed.
It follows that the amortized cost of the meld operation is zero; all of its work is charged to credits and debits set up by other operations.
3.4.6 delete-min
We think about the two-pass pairing as being performed in steps. At the -th step, the pair of trees that is the -th pair from the right among the subtrees of the deleted root are linked, then the resulting tree is linked with the combined tree from the linkings of all the previous steps. Each step will then involve three trees and two links. Let be the tree resulting from the linkings of the previous steps, and let be the number of white nodes in . Let and be the -th pair from the right among the subtrees of the deleted root to be linked at the -th step, and let and respectively be the number of white nodes in their subtrees. It follows that . Let denote the tree resulting from the linking of tree to tree as its leftmost subtree. See Figure 2.
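The step-by-step view used in this analysis builds the same tree as the usual two-pass description, which the following sketch checks on a small example. It is an illustration only: `Node`, `link`, `two_pass`, `stepwise`, `build`, and `shape` are hypothetical helper names, and both combining functions assume a non-empty list of trees.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.children = []  # children[0] is the leftmost child


def link(a, b):
    """The root with the larger key becomes the leftmost child of the
    other root."""
    if b.key < a.key:
        a, b = b, a
    a.children.insert(0, b)
    return a


def two_pass(trees):
    """Pairing pass from left to right, then right-to-left
    incremental linking."""
    paired = [link(trees[i], trees[i + 1])
              for i in range(0, len(trees) - 1, 2)]
    if len(trees) % 2 == 1:
        paired.append(trees[-1])  # unpaired rightmost tree
    result = paired[-1]
    for tree in reversed(paired[:-1]):
        result = link(tree, result)
    return result


def stepwise(trees):
    """Same computation viewed as steps: each step links one pair
    (taken from the right) and then links the result with the tree
    accumulated by the previous steps."""
    pairs = [trees[i:i + 2] for i in range(0, len(trees), 2)]
    acc = None
    for pair in reversed(pairs):
        tree = pair[0] if len(pair) == 1 else link(pair[0], pair[1])
        acc = tree if acc is None else link(tree, acc)
    return acc


def build(keys):
    """Fresh single-node trees, one per key."""
    return [Node(k) for k in keys]


def shape(t):
    """Nested-tuple description of a tree, used to compare results."""
    return (t.key, tuple(shape(c) for c in t.children))
```

Comparing `shape(two_pass(build(keys)))` with `shape(stepwise(build(keys)))` for any key list gives identical trees, so every step after the first involves three trees and two links, as the case analysis assumes.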
We distinguish between four cases, according to the types of the roots of and and who wins the comparison.
1. Both roots are inactive:
There was no potential on the two links that were cut, and no potential is required on the new links either. The actual cost of this step is paid for from the released active-parent credits, as these two roots were children of an active parent and at least one of them no longer is.
-
2.
An active root is linked to an inactive root, and :
The potential that was related to the active root before the operation is enough to cover the potential of the new link with . If the leftmost child of the inactive root was inactive before the link, the active-run credits need to be increased by . As for the previous case, these possibly-needed extra credits and the actual cost of the step are paid for from the released active-parent credits.
-
3.
-
(a)
Both roots are active:
The active-run credits may need to be increased by .
The potential on the two links that are cut at the -th step was
We consider the four possibilities:
-
i.
:
The potential on the new links is
The difference in potential is
-
ii.
:
The potential on the new links is
The difference in potential is
-
iii.
:
The potential on the new links is
The difference in potential is
-
iv.
:
The potential on the new links is
The difference in potential is
-
i.
-
(b)
One root is active and the other is inactive, and :
The active-run credits may need to be increased by .
Since either or equals zero, we use for the other value.
The potential on the cut links is
The potential on the new links is
The difference in potential is
- -
-
If , then . Then, for all the above sub-cases, the change in potential is less than . This released potential is used to pay for the possibly-required increase in the active-run credits, in addition to the actual work done at this step.
- -
-
If , we call this step a bad step. For all the above sub-cases, the change in potential resulting from all bad steps is at most (taking the summation for positive terms only, i.e. ). Since when , the sum of the changes in potential for all steps telescopes to . It remains to account for the actual work done at the bad steps. Since , a bad step results in . Then, the number of bad steps is . It follows that the increase in the active-run credits and the actual work done at bad steps is for each delete-min operation.
-
(a)
-
4.
An inactive root is linked to an active root, and :
The potential that was related to the active root before the operation is enough to cover the potential of the new link with . To cover the actual work done in such step, consider the two steps that follow it. If those two steps are of the same type as this step, the number of active-runs decreases (at least one inactive node is taken out of the way of two active-runs) and such released credits are used to pay for all three steps (this is similar to Iacono’s triple-white notion in his potential function [7]). Otherwise, one of those two steps will pay for the current step as well.
From the above case analysis, it follows that the amortized cost of the delete-min operation is $O(\log n)$.
4 Variations
The main difference between our implementation and the standard implementation of the pairing heaps is the clean-up procedure. We chose to perform the clean-up before the delete-min operation, and to apply it to the smaller heap before the meld operation. The following variations are also possible:
- It is possible to periodically perform the clean-up, once the number of decreased nodes reaches $\lceil \log n \rceil$ following a decrease-key operation. This assures that when the clean-up is performed prior to delete-min operations, there will be at most $\lceil \log n \rceil$ decreased nodes (one group).
- It is possible not to call clean-up prior to meld operations, and to do all the work prior to delete-min operations instead.
- In [4], Fredman stated that the cost of $m$ pairing-heap operations, including $n$ delete-min operations, is $O(m \log_{2m/n} n)$. This bound implies a constant cost for the decrease-key operation when $m = \Omega(n^{1+\epsilon})$, for any constant $\epsilon > 0$. This suggests that, when the number of the decreased nodes is large enough, we perform the clean-up by cutting each of the affected subtrees and directly linking it with the main tree (similar to the standard pairing-heaps implementation).
5 Conclusion
We have given a variation of the pairing heaps that achieves the same amortized bounds as Fibonacci heaps, except for decrease-key (which still matches Fredman’s lower bound for, what he calls [4], a generalized pairing heap). Three important open questions are:
- Is there a self-adjusting heap that achieves amortized decrease-key cost?
- Is it possible that the original implementation of the pairing heaps has the same bounds as those we achieve in this paper?
- Which heap performs better in practice?
References
- [1] A. Elmasry. Pairing heaps with $O(\log\log n)$ decrease cost. 20th ACM-SIAM Symposium on Discrete Algorithms (2009), pp. 471-476.
- [2] A. Elmasry. Parametrized self-adjusting heaps. Journal of Algorithms 52(2) (2004), pp. 103-119.
- [3] M. Fredman. A priority queue transform. 3rd Workshop on Algorithms Engineering, LNCS 1668 (1999), pp. 243-257.
- [4] M. Fredman. On the efficiency of pairing heaps and related data structures. Journal of the ACM 46(4) (1999), pp. 473-501.
- [5] M. Fredman, R. Sedgewick, D. Sleator, and R. Tarjan. The pairing heap: a new form of self-adjusting heap. Algorithmica 1(1) (1986), pp. 111-129.
- [6] M. Fredman and R. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM 34 (1987), pp. 596-615.
- [7] J. Iacono. Improved upper bounds for pairing heaps. Scandinavian Workshop on Algorithms Theory, LNCS 1851 (2000), pp. 32-45.
- [8] D. Jones. An empirical comparison of priority-queues and event-set implementations. Communications of the ACM 29(4) (1986), pp. 300-311.
- [9] B. Moret and H. Shapiro. An empirical assessment of algorithms for constructing a minimum spanning tree. DIMACS Monographs in Discrete Mathematics and Theoretical Computer Science 15 (1994), pp. 99-117.
- [10] S. Pettie. Towards a final analysis of pairing heaps. 46th IEEE Symposium on Foundations of Computer Science (2005), pp. 174-183.
- [11] D. Sleator and R. Tarjan. Self-adjusting heaps. SIAM Journal on Computing 15(1) (1986), pp. 52-69.
- [12] J. Stasko and J. Vitter. Pairing heaps: experiments and analysis. Communications of the ACM 30(3) (1987), pp. 234-249.
- [13] R. Tarjan. Amortized computational complexity. SIAM Journal on Algebraic Discrete Methods 6 (1985), pp. 306-318.