DATA STRUCTURES
Weight balanced binary tree
BODJE NKAUH NATHAN-REGIS, M.PHIL COMPUTER SCIENCE, CHRIST UNIVERSITY, BANGALORE-INDIA
Content Page
Introduction
Chapter i: Binary Trees Foundation
A- Definition
B- Types of binary trees
C- Binary Tree Traversals
D- Glossary
Chapter ii: Balanced Tree Foundation
A- Definition
B- Performance of balanced binary tree
C- Big-O notation
D- Glossary
Chapter iii: Weight balanced binary tree
A- Definition
B- Properties of weight-balanced tree
C- Methods of weight-balanced tree
D- Weight-balanced tree in depth
E- Advantages of weight-balanced tree
F- Implementation of weight-balanced tree
Chapter iv: Weight Balanced Binary Tree in depth
A- Definitions
B- Theorem
C- Performance
References
INTRODUCTION
A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. Many books describe data structures. There are different kinds of data structures, and some are highly specialized for specific tasks. The purpose of this work is to report on the weight balanced binary tree. The binary search tree (BST) data structure is fundamental to computer science, and many variants of BSTs have been devised so far. Weight-balanced BSTs achieve logarithmic worst-case cost by using the size of the subtrees as balancing information [13]. This report is divided into four parts. The first presents the foundations of tree structures in general. The second covers balanced trees. The last two give more information about the main topic of this work, the weight balanced binary tree.
Chapter i: Binary Trees Foundation
A- Definition
A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets [1]:
- The first subset contains a single element called the root of the tree.
- The other two subsets are themselves binary trees, called the left and right subtrees of the original tree.
A left or right subtree can be empty. Each element of a tree is called a node of the tree.
The basic operations we can perform on a tree are searching, insertion, deletion, traversal and sorting [2].
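The recursive definition above can be written down almost literally. The following is a minimal sketch, assuming integer values; the names `Node` and `count_nodes` are illustrative and do not come from the cited sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One element of a binary tree: a value plus two (possibly empty) subtrees."""
    value: int
    left: Optional["Node"] = None    # None stands for the empty tree
    right: Optional["Node"] = None


def count_nodes(tree: Optional[Node]) -> int:
    """Number of elements: 0 for the empty tree, else root plus both subtrees."""
    if tree is None:
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


# A root with a left leaf and a right child that itself has one left leaf.
root = Node(1, Node(2), Node(3, Node(4)))
print(count_nodes(root))  # 4
```

Representing the empty tree as `None` keeps the code close to the definition: every function handles the empty case first and then recurses into the two subtrees.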
B- Types of binary trees
- A rooted binary tree
- A full binary tree [proper binary tree or 2-tree or strictly binary tree]
- A perfect binary tree
- A complete binary tree
- An infinite complete binary tree
- A balanced binary tree *
- A rooted complete binary tree
- A degenerate tree
Note that this terminology often varies in the literature, especially with respect to the meaning of "complete" and "full" [2].
C- Binary Tree Traversals
A binary tree can be traversed using different algorithms [3]. Let L, V, and R stand for moving left, visiting the node, and moving right.
There are six possible combinations of traversal: LVR, LRV, VLR, VRL, RVL, RLV. If we adopt the convention that we traverse left before right, only three traversals remain: LVR (inorder), LRV (postorder) and VLR (preorder).
For better understanding [4], let M be the root node, P the left child node and N the right child node:
+ Pre-Order: MPN
+ In-Order: PMN
+ Post-Order: PNM
+ Level-Order: MPN
1. In-order (Left-Root-Right). The nodes are listed starting with the leftmost child node, then the root node, followed by the remaining child nodes in left to right order.
2. Post-order (Left-Right-Root). The child nodes are listed from left to right, followed by the root.
3. Pre-order (Root-Left-Right). The root node is listed first, followed by the child nodes taken in left to right order. Like the two orders above, it is a depth-first traversal.
4. Level-order, where we visit every node on a level before going to a lower level. This is also called breadth-first traversal.
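The four orders can be checked on the three-node example with root M, left child P and right child N. This is a minimal sketch; the class and function names are illustrative assumptions, not taken from the cited sources.

```python
from collections import deque


class Node:
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right


def preorder(t):
    """Root, then left subtree, then right subtree."""
    return [] if t is None else [t.label] + preorder(t.left) + preorder(t.right)


def inorder(t):
    """Left subtree, then root, then right subtree."""
    return [] if t is None else inorder(t.left) + [t.label] + inorder(t.right)


def postorder(t):
    """Left subtree, then right subtree, then root."""
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.label]


def level_order(t):
    """Breadth-first: a queue holds the nodes of the current and next level."""
    out = []
    queue = deque([t] if t is not None else [])
    while queue:
        node = queue.popleft()
        out.append(node.label)
        queue.extend(c for c in (node.left, node.right) if c is not None)
    return out


tree = Node("M", Node("P"), Node("N"))
print("".join(preorder(tree)))     # MPN
print("".join(inorder(tree)))      # PMN
print("".join(postorder(tree)))    # PNM
print("".join(level_order(tree)))  # MPN
```

The three depth-first orders differ only in where the root label is placed relative to the two recursive calls, while level-order needs an explicit queue.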
D- Glossary
- Tree - a non-empty collection of nodes and edges.
- Node - can have a name and carry other associated information.
- The maximum number of nodes on level i of a binary tree is $2^{i-1}$, i >= 1 (here the root is counted as level 1).
- The maximum number of nodes in a binary tree of depth k is $2^k - 1$, k >= 1 (here the depth is counted as the number of levels).
- Path - a list of distinct nodes in which successive nodes are connected by edges. Any two nodes must have one and only one path between them, else it is not a tree.
- A directed edge refers to the link from the parent to the child.
- Root - starting point (top) of the tree.
- Parent (ancestor) of a node A - the node "above" node A.
- Child (descendant) of a node A - the node "below" node A.
- Siblings - nodes that have the same parent.
- Leaf (terminal node) - a node with no children.
- Level of a node - the number of edges between this node and the root.
- Depth of a node - the number of edges from the root to that node. The depth of the root is 0.
- Height of a node - the largest number of edges from that node to a leaf. The height of each leaf is zero.
- Height of a tree - the number of edges on the longest path from the root to a leaf node.
- The size of a node is the number of descendants it has, including itself.
- In-degree of a node is the number of edges arriving at that node.
- Out-degree of a node is the number of edges leaving that node.
Chapter ii: Balanced Tree Foundation
A- Definition
A balanced tree is a tree in which, at every node, the subtrees have roughly the same height. A balanced tree will have the lowest possible overall height [7]. It can also be described as a binary tree where no leaf is more than a certain amount farther from the root than any other [8]. A balanced tree is a sorted collection of key/value pairs optimized for searching and traversing in order.
Balanced trees can be classified by three kinds of balance condition [8]:
- Height balanced: AVL trees, 2-3 trees, 2-3-4 trees, B trees
- Weight condition: BB[a] trees (*)
- Structural conditions: brother trees, 2-3 trees, a-b trees, B-trees
A balanced binary tree is a tree that is explicitly kept balanced. It has a lookup and insertion complexity of O(log(n)), and an accumulated complexity of O(n*log(n)) over n operations, all in the worst case. In one reported comparison it turned out to be faster than a sorted array by about a factor of 2. For maximum efficiency, a binary search tree should be balanced.
B- Performance of balanced binary tree [10]
Binary search trees provide O(lg n) performance on average for important operations such as item insertion, deletion and search. Balanced trees provide O(lg n) even in the worst case.
Consider a binary tree in which every level is full, so that the number of nodes doubles at each level. With three levels below the root, the total number of nodes is:
$1 + 2 + 2^2 + 2^3 = 2^4 - 1 = 15$
Note that $2^4 - 1 = 2 \cdot 2^3 - 1$, and also that $3 = \log_2(2^3)$.
In general, if the tree has M levels below the root, the number of nodes is:
$1 + 2 + 2^2 + \dots + 2^M = 2^{M+1} - 1 = 2 \cdot 2^M - 1$
Let N be the number of nodes. We now find how M (the depth of the tree) depends on the number of nodes:
$N = 2 \cdot 2^M - 1 \Rightarrow 2 \cdot 2^M = N + 1 \Rightarrow 2^M = (N+1)/2 \Rightarrow M = \log_2((N+1)/2) = O(\log N)$
Thus, given a balanced binary tree with N nodes, the height of the tree is O(log(N)). Given a balanced binary tree with M levels, the number of nodes is O(2^M).
Average height of a binary tree: O(sqrt(N)), computed by considering all possible binary trees with N nodes.
Worst case: the tree is a list, in which every node except the leaf has only one child. The height is then N-1.
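The two extremes, logarithmic height for a balanced tree and height N-1 for a list-shaped tree, can be observed directly. The sketch below assumes N = 15 keys; `build_balanced` and `build_list` are illustrative helpers, not algorithms from the cited sources.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def height(t):
    """Edges on the longest root-to-leaf path; -1 for the empty tree."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def build_balanced(keys):
    """Put the middle key at the root so the two halves differ in size by at most 1."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build_balanced(keys[:mid]), build_balanced(keys[mid + 1:]))


def build_list(keys):
    """Degenerate tree: every node has only a right child."""
    root = None
    for key in reversed(keys):
        root = Node(key, None, root)
    return root


keys = list(range(1, 16))            # N = 15 = 2^4 - 1
print(height(build_balanced(keys)))  # 3, i.e. log2((N + 1) / 2)
print(height(build_list(keys)))      # 14, i.e. N - 1
```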
C- Big-O notation
Definition: A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n). The notation is read, "f of n is big oh of g of n".
Formal Definition: f(n) = O(g(n)) means there are positive constants c and k, such that $0 \le f(n) \le c\,g(n)$ for all $n \ge k$. The values of c and k must be fixed for the function f and must not depend on n. [16]
The importance of this measure can be seen in trying to decide whether an algorithm is adequate but may just need a better implementation, or whether the algorithm will always be too slow on a big enough input. For instance, quicksort, which is O(n log n) on average, running on a small desktop computer can beat bubble sort, which is O(n²), running on a supercomputer if there are a lot of numbers to sort. To sort 1,000,000 numbers, quicksort takes 20,000,000 steps on average, while bubble sort takes 1,000,000,000,000 steps! [17]
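The step counts quoted for 1,000,000 numbers follow from the two growth rates. The short check below assumes a cost model of n log2 n steps for quicksort and n² steps for bubble sort, with constant factors ignored.

```python
import math

n = 1_000_000
quicksort_steps = n * math.log2(n)  # average-case model: n * log2(n)
bubble_steps = n ** 2               # quadratic model: n^2

print(f"{quicksort_steps:,.0f}")    # 19,931,569, i.e. about 20,000,000
print(f"{bubble_steps:,}")          # 1,000,000,000,000
```

Under this model bubble sort needs roughly 50,000 times as many steps, which is why a much faster machine cannot make up the difference.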
D- Glossary
A binary search tree is a binary tree with the following additional property: given a node T, each node to the left is smaller than T and each node to the right is larger.
Some examples of search-tree data structures are [12]:
- Red-black trees and splay trees, both of which are instances of self-balancing binary search trees;
- Ternary search trees, in which each internal node has exactly three children;
- B trees, commonly used in databases;
- B+ trees, like B trees but with all data values stored in the leaves;
- van Emde Boas trees, very efficient if the data values are fixed-size integers.
[Figure: three example trees. (a) is not a BST; (b) and (c) are BSTs.]
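The binary search tree property can be tested mechanically by carrying down the range of keys allowed in each subtree. This is a minimal sketch with hypothetical example trees. They are not the trees (a), (b) and (c) of the figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def is_bst(t, low=None, high=None):
    """True if every key in t lies strictly between low and high (None = no bound)
    and the same holds recursively with the bounds narrowed by each node's key."""
    if t is None:
        return True
    if low is not None and t.key <= low:
        return False
    if high is not None and t.key >= high:
        return False
    return is_bst(t.left, low, t.key) and is_bst(t.right, t.key, high)


good = Node(8, Node(3, Node(1), Node(6)), Node(10))
bad = Node(8, Node(3, None, Node(9)), Node(10))  # 9 sits in the left subtree of 8

print(is_bst(good))  # True
print(is_bst(bad))   # False
```

Comparing a node only with its two children is not enough: in `bad`, 9 is correctly placed to the right of 3 but still violates the property with respect to the root 8.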
Chapter iii: Weight Balanced Binary Tree
A- Definition
A weight-balanced binary tree is a binary tree which is balanced based on knowledge of the probabilities of searching for each individual node. Within each subtree, the node with the highest weight appears at the root. This can result in more efficient searching performance [11]. It is also known as a BB(α) tree [18].
B- Properties of weight-balanced tree [13]
Because the tree is weight-balanced, the distances between any node and each of the leaf node descendants of that node are equal. So, for any leaf nodes x, y
C- Methods of weight-balanced tree
Weight balanced trees are balanced binary search trees in which the weight is used as the criterion for balancing. The weight of a tree is defined as the number of external nodes in the tree (this equals the number of null pointers in the tree) [1]. If, for every node, the ratio of the weight of its left subtree to the weight of the subtree rooted at the node is between some fraction a and 1-a, the tree is a weight balanced tree of ratio a, or is said to be in the class wb[a].
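The weight and the ratio test can be sketched as follows. The code reads the condition as a <= ratio <= 1 - a, which is an assumption about the intended bounds; the names `weight` and `in_wb` are illustrative.

```python
from fractions import Fraction


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def weight(t):
    """Number of external nodes (null pointers) in t; the empty tree counts as 1."""
    return 1 if t is None else weight(t.left) + weight(t.right)


def in_wb(t, a):
    """True if, at every node, weight(left) / weight(node) lies in [a, 1 - a]."""
    if t is None:
        return True
    ratio = Fraction(weight(t.left), weight(t))
    return a <= ratio <= 1 - a and in_wb(t.left, a) and in_wb(t.right, a)


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))  # right-leaning list of three keys

print(weight(balanced))                # 4
print(in_wb(balanced, Fraction(1, 3)))  # True: every ratio is 1/2
print(in_wb(chain, Fraction(1, 3)))     # False: the root ratio is 1/4
```

This version recomputes weights on every call, which is quadratic in the worst case. An actual implementation would store the weight in each node, as the deletion algorithm later in this chapter does with `t.weight`.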
D- Advantages of weight-balanced tree
A weight-balanced binary tree has several advantages over the other data structures for large aggregates:
- In addition to the usual element-level operations like insertion, deletion and lookup, there is a full complement of collection-level operations, like set intersection, set union and subset test, all of which are implemented with good orders of growth in time and space. This makes weight-balanced trees ideal for rapid prototyping of functionally derived specifications.
- An element in a tree may be indexed by its position under the ordering of the keys, and the ordinal position of an element may be determined, both with reasonable efficiency.
- Operations to find and remove the minimum element make weight-balanced trees simple to use for priority queues.
- The implementation is functional rather than imperative. This means that operations like `inserting' an association in a tree do not destroy the old tree, in much the same way that (+ 1 x) modifies neither the constant 1 nor the value bound to x. The trees are referentially transparent, thus the programmer need not worry about copying the trees. Referential transparency allows space efficiency to be achieved by sharing subtrees.
These features make weight-balanced trees suitable for a wide range of applications, especially those that require large numbers of sets or discrete maps. Applications that have a few global databases and/or concentrate on element-level operations like insertion and lookup are probably better off using hash tables or red-black trees. A weight-balanced tree takes space that is proportional to the number of associations in the tree.
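The functional style and the sharing of subtrees can be illustrated with an immutable node type. This is only a sketch of the idea: it performs a plain binary search tree insertion without any rebalancing, and the names are illustrative.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(t, key):
    """Return a new tree containing key; the tree t itself is never modified."""
    if t is None:
        return Node(key)
    if key < t.key:
        return t._replace(left=insert(t.left, key))    # copy only the search path
    if key > t.key:
        return t._replace(right=insert(t.right, key))
    return t                                           # key already present


old = insert(insert(insert(None, 5), 3), 8)
new = insert(old, 9)

print(old.right.right)       # None: the old tree is unchanged
print(new.right.right.key)   # 9
print(new.left is old.left)  # True: the untouched left subtree is shared
```

Only the nodes on the path from the root to the insertion point are copied, so an insertion costs space proportional to the height of the tree while both versions remain usable.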
E- Implementation of weight-balanced tree
The following table shows a simple BB[alpha] tree (with alpha=1/4) and the balances of the nodes [19].
1. TREE TRAVERSAL
Tree traversal refers to the process of visiting (examining and/or updating) each node in a tree data structure. The algorithm below walks down from the root towards a given key.
Algorithm traversal
Parameters: root and key
Initialization:
  r receives the root
  L receives the number of leaves
  l is determined by floor((1-a)*L)
Work: repeat these instructions
  o n receives r.key, r.left and r.right
  o If r is null, return null
  o Else if key = n.key then return n
  o Else if key < n.key then r receives n.left
  o Else r receives n.right
Notes
  o We obtain a simpler remainder function:
    W(pl) <= (1 - alpha) W(p)
    W(pr) <= (1 - alpha) W(p)
    (W(p) denotes the number of leaves of p, p is a node, pl and pr are its left and right child, respectively.)
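The Work loop of the algorithm can be sketched in a few lines. The sketch assumes nodes with `key`, `left` and `right` fields, and it leaves out the leaf count L and the bound l, which the loop itself does not use.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def search(root, key):
    """Walk down from the root; return the node holding key, or None if absent."""
    r = root
    while r is not None:
        if key == r.key:
            return r
        # Smaller keys live in the left subtree, larger keys in the right one.
        r = r.left if key < r.key else r.right
    return None


tree = Node(8, Node(3, Node(1), Node(6)), Node(10))
print(search(tree, 6).key)  # 6
print(search(tree, 7))      # None
```

Each iteration moves one level down, so the number of iterations is bounded by the height of the tree.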
2. TREE DELETION
Deletion can be accomplished by checking for the separate cases of a node with no children, one child, or two children.
Algorithm delete
Parameters: root and key
Initialization: t receives the root
Work:
  Progressing in the tree
    o If t is null, return null
    o Else if t.key < key then t.right receives delete(t.right, key)
    o Else if t.key > key then t.left receives delete(t.left, key)
  Deletion if a descendant is null
    o Else if t.left = null then t receives t.right
    o Else if t.right = null then t receives t.left
  Deletion if no descendant is null
    o Else if wt(t.left) > wt(t.right) then [t = rrot(t); t.right = delete(t.right, key)]
    o Else [t = lrot(t); t.left = delete(t.left, key)]
  Reconstruct weight information
    o If t is not null then t.weight receives wt(t.left) + wt(t.right)
    o check(t)
    o Return t
Notes
  o wt is a function which gives the weight of a node
  o rrot and lrot are the right and left rotations, respectively
  o check verifies that t is weight balanced
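The deletion algorithm can be sketched as below. The sketch assumes that the weight of a node is its number of external nodes, so an empty subtree has weight 1. It leaves out the final `check` step, so it maintains the weights but does not rebalance.

```python
def wt(t):
    """Weight of a subtree: 1 for an empty subtree, else the stored weight."""
    return 1 if t is None else t.weight


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.weight = wt(left) + wt(right)


def update(t):
    t.weight = wt(t.left) + wt(t.right)


def rrot(t):
    """Right rotation: the left child becomes the root, t moves to the right."""
    l = t.left
    t.left, l.right = l.right, t
    update(t)
    update(l)
    return l


def lrot(t):
    """Left rotation: the right child becomes the root, t moves to the left."""
    r = t.right
    t.right, r.left = r.left, t
    update(t)
    update(r)
    return r


def delete(t, key):
    if t is None:
        return None
    if t.key < key:
        t.right = delete(t.right, key)
    elif t.key > key:
        t.left = delete(t.left, key)
    elif t.left is None:             # at most one child: splice the node out
        return t.right
    elif t.right is None:
        return t.left
    elif wt(t.left) > wt(t.right):   # two children: rotate the node downwards
        t = rrot(t)
        t.right = delete(t.right, key)
    else:
        t = lrot(t)
        t.left = delete(t.left, key)
    update(t)                        # reconstruct weight information
    return t


def keys_in_order(t):
    return [] if t is None else keys_in_order(t.left) + [t.key] + keys_in_order(t.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
tree = delete(tree, 4)
print(keys_in_order(tree))  # [1, 2, 3, 5, 6, 7]
print(tree.weight)          # 7: six keys leave seven external nodes
```

Rotating towards the lighter side pushes the node to be deleted down the tree until one of its subtrees is empty, at which point it can be spliced out directly.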
In this section we have tried to give the essential implementation of the weight balanced binary tree. Other functions that can be adapted for the WBBT can be found here:
a) http://www.cse.iitb.ac.in/~as/mit-scheme/scheme_12.html#SEC127
b) http://www-users.aston.ac.uk/~beaumoaj/AJBcs124/DSA/Page2.3.html
Chapter iv: Weight Balanced Binary Tree in depth
A- Definitions
Let α (Alpha) be a real such that 1/4 < α <= 1.
Let T be a binary search tree.
Let Tl (Tr) be the left (right) subtree of the root of T.
The root balance of T, ρ(T), is given by:
$\rho(T) = |T_l| / |T| = 1 - |T_r| / |T|$
where |T| = number of leaves of tree T.
A tree T is of bounded balance α if for every subtree T' of T we have:
$\alpha \le \rho(T') \le 1 - \alpha$
Note that this condition can only be satisfied when α <= 1/2.
BB[α] is the set of all trees of bounded balance α.
Example of BB[α] tree, α <= 1/3

Node    Root balance
a       1/2
b       2/5
c       5/14
d       2/3
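The bound Alpha <= 1/3 for this example can be recomputed from the listed root balances. The check below assumes that a, b, c and d are the only internal nodes of the example tree.

```python
from fractions import Fraction

# Root balances of the four nodes as listed in the example.
balances = {
    "a": Fraction(1, 2),
    "b": Fraction(2, 5),
    "c": Fraction(5, 14),
    "d": Fraction(2, 3),
}

# A node with balance rho satisfies alpha <= rho <= 1 - alpha
# exactly when alpha <= min(rho, 1 - rho).
max_alpha = min(min(rho, 1 - rho) for rho in balances.values())
print(max_alpha)  # 1/3
```

Node d is the limiting one: its balance 2/3 is the farthest from 1/2, so the tree belongs to BB[Alpha] only for Alpha up to 1 - 2/3 = 1/3.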
Let $b_i$ denote the depth of node $x_i$ in tree T. The average path length of T is given by:
B- Theorem
Let T be a BB[α] tree with n nodes. Then
a)
b) if then theorem 2 states P <= 1.15(1 + 1/n) lg(n+1) - 1 and height(T) <= 2 lg(n+1) - 1
C- Performance
Lemma
Let . Let T' be a BB[α] tree. Let T be the result of adding (deleting) a node to (from) T'. Assume that Tl and Tr are BB[α] trees. Assume that T is no longer a BB[α] tree. Then a single or double rotation will balance T.
Theorem
Let . Then there is a constant c such that the total number of single and double rotations required in a sequence of M insertions and deletions into an empty BB[α] tree is <= cM. For α = 3/ the proof gives the value c = 19.59. Experiments suggest that c is near 1.
Theorem
Let . Then insert, access and delete take O(lg n) in a BB[α] tree with n leaves.
It has been shown that the cost W of a weight balanced binary tree satisfies the inequalities $H \le W \le H + 3$, where H is the entropy of the set of the leaves. For a class of smooth distributions the inequalities $H \le W \le H + 2$ are derived. These results imply that for sets with large entropy the search times provided by such trees cannot be substantially shortened when binary decisions are being used. (J. Rissanen: Bounds for Weight Balanced Trees)
References and Books reviews
[1] Data Structures Using C and C++, 2nd Edition; ISBN 81-317-0328-2
[2] http://en.wikipedia.org/wiki/Binary_tree
[3] University of North Texas, CSCE 3110 Data Structures & Algorithm Analysis
[4] http://www.laynetworks.com/cs04.htm
[5] http://sites.google.com/site/sumedhshende/binarytrees
[6] http://en.wikipedia.org/wiki/Binary_search_tree
[7] http://wiki.answers.com/Q/What_is_a_balanced_tree_in_data_structures
[8] http://www.csie.ntu.edu.tw/~wcchen/algorithm/balanceTree/balanced.htm
[9] http://www.itl.nist.gov/div897/sqg/dads/HTML/balancedbitr.html
[10] http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L07Trees.htm
[11] http://en.wikipedia.org/wiki/Weight-balanced_tree
[12] http://en.wikipedia.org/wiki/Search_tree
[13] http://planetmath.org/encyclopedia/WeightBalancedBinaryTreesAreUltrametric.html
[14] Abstract, Salvador Roma, http://springerlink.com/content/d9191wje6j4rr4b/
[15] Handbook of Data Structures and Applications; edited by Dinesh P. Mehta (Colorado School of Mines, Golden) and Sartaj Sahni (University of Florida, Gainesville)
[16] http://www.itl.nist.gov/div897/sqg/dads/HTML/bigOnotation.html
[17] Jon Bentley, Programming Pearls: Algorithm Design Techniques, CACM, 27(9):868, September 1984, for an example of a microcomputer running BASIC beating a supercomputer running FORTRAN.
[18] http://www.itl.nist.gov/div897/sqg/dads/HTML/bbalphatree.html
[19] http://www.auto.tuwien.ac.at/~blieb/woop/bbalpha.html
[20] http://www.eli.sdsu.edu/courses/fall95/cs660/notes/BB/BBtrees.html