Commit 72d3a9e

started the documentation, insertion is done
1 parent ace49d4 commit 72d3a9e

File tree

3 files changed

+126
-5
lines changed

.gitignore

+3-2
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
*.out
22

3-
// github.com/hruthik0x/bintree - used for debugging
4-
bintree files
3+
// github.com/hruthik0x/disptree - used for debugging
4+
disptree files
5+
doc_files

README.md

+8-3
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,9 @@
11
# Push Down - Chain
22
**Owner** : [Hruthik0x](https:/github.com/hruthik0x)
33

4+
This is a heap implementation that attempts to use a linked list structure while working as fast as an array implementation of a heap.<br>
5+
Objective : Build a heap implementation that is as fast as the array implementation and is also able to work in fragmented memory environments.
6+
47
Run this to see benchmarks :
58

69
`g++ benchmark.cpp heap.c utility.c arrayHeap.h -o a.out && ./a.out`
@@ -39,6 +42,8 @@ Combines the best of both - Speed and ability to work efficiently in fragmented
3942
Uses linked list (binary tree) to take care of fragmented memory scenarios.
4043

4144
- Uses "push down" instead of "bubble up" like the classic heap implementation (both array and linked list) do.
42-
- Does not insert the element at the end and then bubble it up, instead inserts the element at the top, and the pushes it down along a specific path which leads to the correct location (Location at which, inserting maintains complete binary tree property)
43-
- Time taken to find this appropriate path is **always** directly proportional to height of the tree O(log(n)).
44-
- Uses a different approach to find the last leaf node element during deletion whose time complexity is always (constant) **O(1)**
45+
- Does not insert the element at the end and then bubble it up; instead it inserts the element at the top and then pushes it down along a specific path that leads to the correct location (the location at which inserting maintains the complete binary tree property).
46+
- Time taken to find this appropriate path is **always** directly proportional to height of the tree O(log(n)).
47+
- Uses a different approach to find the last leaf node element during deletion whose time complexity is always (constant) **O(1)**
48+
49+
Read more about the internal implementation here.

internal_working.md

+115
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,115 @@
1+
To make the new heap implementation, we will use existing implementations and modify them.
2+
We have two choices :
3+
1) Make array implementation work with fragmented memory
4+
2) Make linked list implementation faster.
5+
6+
It is not possible to go with 1), as arrays need a contiguous block of memory, so we are left with 2), that is, make the linked list implementation faster.
7+
8+
## Insertion
9+
10+
In a binary heap ( tree-node structure ), a new node is usually added as the left most child of the first available leaf node in order to maintain the complete binary tree property.
11+
Finding this position requires a level-order traversal (BFS), which takes O(n) time complexity, since we may need to examine all nodes to find the appropriate insertion point.
12+
13+
What if we can bring down the time complexity of finding the appropriate position?
14+
15+
### Finding the path
16+
Look at the following binary tree :
17+
18+
Level 0 : 22<br>
Level 1 : 18, 19<br>
Level 2 : 14, 8, 17, 15<br>
Level 3 : 13, 11, 5, 2, `P`<br>
19+
In order to maintain the complete binary tree property, the next element has to be inserted at position `P`, where `L = 3` and `C = 4` **(`L` stands for level and `C` stands for column, both counted from 0).**<br>
20+
Upon closer inspection, we also find that, in order to reach the position `P`, we have to traverse from the root in the following order
21+
22+
`root -> right -> left -> left` i.e `Right, Left, Left`
23+
24+
If I represent right as 1 and left as 0, then it would be `100` which is binary representation of `C` in `L` bits, i.e 4 in 3 bits.
25+
This means **if** I know `C` and `L`, I can get to the appropriate position in O(log n), as I can simply traverse the path by checking every bit of `C` represented in binary in `L` bits.<br>
26+
**This is true for all `C` and `L`.**
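
The traversal described above can be sketched in C as follows. This is only an illustration under stated assumptions: the `struct node` layout and the name `find_slot` are hypothetical and are not taken from the repository's source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, for illustration only. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/*
 * Returns the address of the child pointer (the "slot") found at
 * level `level`, column `column`. The `level` low bits of `column`
 * are read from the most significant to the least significant:
 * 1 means go right, 0 means go left.
 * Assumes level >= 1 and that every node on the path above the
 * final slot already exists.
 */
struct node **find_slot(struct node *root, unsigned level, unsigned column)
{
    assert(root != NULL && level >= 1);
    struct node *cur = root;
    struct node **slot = NULL;
    for (unsigned bit = level; bit-- > 0; ) {
        slot = ((column >> bit) & 1u) ? &cur->right : &cur->left;
        if (bit > 0)
            cur = *slot; /* descend; only the final slot may be empty */
    }
    return slot;
}
```

For `L = 3` and `C = 4` the loop reads the bits `1`, `0`, `0`, so it returns the address of `root->right->left->left`, which is the position `P` in the tree above. The loop runs `L` times, which is where the O(log n) bound comes from.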
27+
28+
### Managing C and L
29+
- `C` and `L` represent the path to the position where the **next** node is to be inserted.
30+
- They **do not** represent the path to the position of the current right most leaf node in last level.
31+
- We store `C` and `L` in root Node.
32+
- So our root node is going to be different from the rest of the nodes, as it has `L` and `C` along with `data`, `left` and `right` (`data` is the number stored in the root node; `left` and `right` are the addresses of the left and right child nodes).
33+
- Initially, when there is only the root node, `L = 1` and `C = 0` (`L = 1` : the next node will be inserted at level 1; `C = 0` : it will be the left child).
34+
- Every time an element is inserted `C` is incremented, after incrementing `C`, if `C` is greater than 2<sup>L</sup> - 1, `C` is reset to 0 and `L` is incremented.
35+
36+
Example (Root is considered as the 0<sup>th</sup> element) :
37+
- Initially `L = 1` and `C = 0` when there is only root node.
38+
- For the 1<sup>st</sup> element, `C` is incremented to 1, and since C <= 2<sup>L</sup> - 1 (1 <= 1) `L` remains the same.
39+
+ Result : `L = 1` , `C = 1`
40+
- For the 2nd element, `C` is incremented to 2, but C > 2<sup>L</sup> - 1 (2 > 1), so `C` is reset to 0 and `L` is incremented.
41+
+ Result : `L = 2` , `C = 0`
42+
- For the 3rd element, `C` is incremented to 1, and since C <= 2<sup>L</sup> - 1 (1 <= 3) `L` remains the same.
43+
+ Result : `L = 2` , `C = 1`
44+
45+
Now that we found the appropriate position, we can simply put the new element here, and bubble it up like we generally do in binary heap.
46+
47+
So that makes the complexity `O(log n) + O(log n)` [ `O(log n)` for finding the appropriate position and `O(log n)` for bubble up ].<br>
48+
In the classic heap implementation with a binary tree structure, the time complexity would be `O(n) + O(log n)` [ `O(n)` for finding the appropriate position and `O(log n)` for bubble up ].
49+
50+
Can we make it better than `O(log n) + O(log n)` ?
51+
52+
### Push-Down
53+
What if we perform `Push-Down` instead of the traditional `Bubble-Up` ?
54+
55+
- Let the number we are going to insert be Q
56+
- As we go down the path we generated using `C` and `L`, we check whether each number on the path is smaller than Q (for a max heap) or larger than Q (for a min heap); if it is, we swap that number with Q.
57+
- We continue going down the path and re-arrange the numbers in this manner.
58+
- **In short**, we are treating Q as the parent node and comparing all the nodes along the path as child nodes and then performing appropriate swaps (heapification).
59+
60+
Example :
61+
62+
Let Q be 20.
63+
Since `C` and `L` are 4 and 3 respectively, the path is Right, left, left (`100` : `C` represented in binary as `L` bits)
64+
- First we check if Q is larger than root.
65+
In this case it is not (22 > 20), so we continue down the path.
66+
- We go right : 1, here 19 is smaller than 20, so both are swapped, Now Q is 19.
67+
- We go left : 0, here 17 is smaller than 19, so both are swapped, Now Q is 17.
68+
- We go left : 0, insert the Q (17) here, as this is the last direction / bit.
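
The walk-through above can be sketched in C for a max heap. The `struct node` layout and the name `push_down_insert` are hypothetical, chosen for illustration rather than taken from the repository, and updating `L` and `C` after the insertion is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout, for illustration only. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/*
 * Inserts q into a max heap whose next free slot is at (level, column).
 * q is carried down the path given by the bits of `column`; whenever a
 * node on the path holds a smaller number, that number and q are swapped.
 * Whatever q holds at the end is stored in a new leaf at the final slot.
 * Returns 0 on success and -1 if the allocation fails.
 */
int push_down_insert(struct node *root, unsigned level, unsigned column, int q)
{
    assert(root != NULL && level >= 1);
    struct node *leaf = malloc(sizeof *leaf);
    if (leaf == NULL)
        return -1;
    leaf->left = NULL;
    leaf->right = NULL;

    struct node *cur = root;
    for (unsigned bit = level; bit-- > 0; ) {
        if (cur->data < q) { /* swap the carried number with this node */
            int tmp = cur->data;
            cur->data = q;
            q = tmp;
        }
        struct node **slot = ((column >> bit) & 1u) ? &cur->right : &cur->left;
        if (bit == 0) {
            leaf->data = q; /* last bit: attach the new leaf here */
            *slot = leaf;
        } else {
            cur = *slot;
        }
    }
    return 0;
}
```

With Q = 20, `L = 3` and `C = 4`, the root keeps 22, the node that held 19 now holds 20, the node that held 17 now holds 19, and the new leaf holds 17, as in the steps above. Comparison and descent happen in the same loop, so the tree is walked only once.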
69+
70+
Now we are performing `Push-Down` instead of `Bubble-Up`.
71+
72+
**Note** :
73+
74+
Basically, we are performing heapification from top to bottom (`Push-Down`) instead of performing it from bottom to top (`Bubble up`) while carefully choosing certain nodes to rearrange in the tree, in such a manner that the tree holds the complete binary tree property.
75+
76+
Here we treat the element to be inserted ( Q in the above example ) as the parent.
77+
78+
While performing heapification from top to down, we will face two choices, when a parent node is smaller than both of its children (In max heap).
79+
We do not know which child to choose to swap with.
80+
We have to choose a child in such a manner that the complete binary tree property is maintained.
81+
The decision of choosing the appropriate child is accomplished using `C` and `L`.
82+
83+
Since we are swapping numbers (heapify) while traversing the path, the work is done in a single O(log n) pass instead of O(log n) + O(log n).
84+
85+
## Deletion
86+
Number in root node is returned
87+
The last number (the number in the right most leaf node of the last level) will be swapped with the number in the root node, and that leaf node will be pruned.
88+
89+
Step 0 : Return root->num
90+
Step 1 : Find the right most leaf node in last level : O(n)
91+
Step 2 : Swap the root with the leaf node.
92+
Step 3 : Delete the leaf node.
93+
Step 4 : Heapify. (Starting from root). : O(log n)
94+
95+
Total complexity = O(n) + O(log n)
96+
97+
### No-chain
98+
We can get the O(n) in finding the right most leaf node in last level to O(log n) by using `L` and `C` as explained in the Insertion section.
99+
This makes the complexity O(log n) + O(log n) [Finding + Heapify].
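
One way to locate the right most leaf node of the last level with `L` and `C` is sketched below. Since `L` and `C` describe the **next** insertion slot, the sketch first steps back by one position and then follows the bits. The step-back rule, the `struct node` layout and the name `find_last` are assumptions made for illustration; the `No-Chain` branch may do this differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, for illustration only. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/*
 * Returns the most recently inserted node (the right most node of the
 * last level), given the (level, column) of the NEXT insertion slot.
 */
struct node *find_last(struct node *root, unsigned level, unsigned column)
{
    assert(root != NULL && level >= 1 && level < 32);

    /* Step back one slot: either one column to the left, or the last
       column of the previous level when the current level is empty. */
    if (column > 0) {
        column--;
    } else {
        level--;
        column = (1u << level) - 1u;
    }

    /* Follow the bits of `column`: 1 means right, 0 means left. */
    struct node *cur = root;
    for (unsigned bit = level; bit-- > 0; )
        cur = ((column >> bit) & 1u) ? cur->right : cur->left;
    return cur;
}
```

For the example tree (`L = 3`, `C = 4`) the previous slot is column 3, i.e. `011`, so the walk goes Left, Right, Right and ends at the node holding 2. When only the root exists (`L = 1`, `C = 0`), the function returns the root itself.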
100+
101+
This implementation uses the same amount of memory as a traditional linked list implementation (except for the root node, as it stores `L` and `C`).<br>
102+
This implementation is not in the main branch; it is in the `No-Chain` branch.
103+
104+
Can we bring down the complexity of finding the right most leaf node in the last level from O(log n) to O(1)?
105+
106+
### Chain
107+
Note : This implementation uses more memory per node than a traditional binary tree implementation (Stores one address in addition to a normal node).
108+
109+
We can chain/connect all the non-leaf nodes with a linked list, where the head (Called as `Last Parent`) of the linked list is the parent of the last leaf node.
110+
The address of `Last Parent` is stored in the root.
111+
112+
As I keep inserting elements, I chain the parents.
113+
114+
For example :<br>
115+
Root Node : a
