- Applies the basic data structures to the management of dynamic memory
- Presents alternative ways to structure the collection of unused memory locations: two-way circular lists, the buddy system, and indexing methods
- Discusses problems that arise when structures share storage
- Explains fundamental techniques for reclaiming unused storage
- Illustrates data abstraction and functional modularization in the development of three stackless binary tree traversals
- Calls attention to the problems of garbage generation and dangling references

In languages such as C, arrays are *static* data structures: they do not change size or shape during the execution of a program. Chains, however, do change size, though they don't change shape; they are *dynamic* data structures.

Although storage management is viewed here from the perspective of a single program's needs, these same techniques are employed by operating systems to deal with the allocation and reclamation of storage for a group of active programs.

A program can store, in linked representation, a number of binary trees that must be traversed using stacks. If the trees must be stored and traversed simultaneously, it is possible that available storage for the records of the tree nodes and for the entries of the stacks will not be sufficient for the worst case. However, if the available storage can be shared, so that it is used for tree records or for stack entries when needed, then the program might execute correctly.

Nodes that are *in use* appear in the dynamic data structures of the program and are not available for allocation.

Nodes on the list of available space are said to be *known*. Nodes that are not in use but are also not on the list of available space are called *garbage*.

**1**--in use since the node appears on l or t

**2**--not in use but on the available space list

**3**--not in use and not on the available space list, hence garbage

Storage allocation now involves three tasks:

**1. **Find a node on the list of available space.

**2. **Delete that node from the list of available space.

**3. **Set a pointer value to point to the node.

An avail-like function, discussed in Chapter 3, will perform these three tasks.

When dealing with variable-size records, which is the usual situation, storage management is complicated by *fragmentation*. The fragmentation problem is as follows. When all records being allocated dynamically are the same size, then as long as unused storage finds its way to the list of available space, it can be reused. With variable-size records, however, storage for a large record may be needed when every available block is individually too small to hold it, even though the total amount of available storage is sufficient; the available storage has become fragmented into small pieces.

Fragmentation can also be reduced by creating larger nodes: when storage is freed, it is merged with any adjacent storage that is already free. This process is called *coalescing*. These policies and ways of organizing storage must be addressed by designers of compilers and operating systems, and by programmers when they create storage management schemes for their own use.

Recall that the more structured or organized the data, the faster the access to required information. At the same time a price must be paid to maintain the organization. The primary considerations for a storage management scheme are the time to satisfy a request for a new node, the time to insert a node on the list of available space, the storage utilization, and the likelihood of satisfying a request. How well these characteristics turn out for a particular storage management scheme depends on the sequence of requests and returns. Mathematical analysis is generally not possible, and simulation must be used to evaluate different schemes. To provide some idea of the trade-offs and possibilities, three fundamental organizations for the available space list and appropriate allocation and reclamation policies are presented next.

The simplest organization for the list of available space is a two-way circular list, with nodes kept in arbitrary order, in order by size, or in order by actual memory address. Two-way lists are convenient for making insertions and deletions. They allow the list to be traversed starting from any list record, and in either direction. A request for a node of size *l* then requires a traversal of the list, starting from some initial node.

Two simple policies for selecting the list node to be deleted to satisfy a request are *first-fit* and *best-fit*. *First-fit* selects the first node encountered whose size is sufficient; *best-fit* selects, among all nodes of sufficient size, one of smallest size.

Another organization, called the *buddy system*, provides faster request and return responses than two-way circular lists. The buddy system requires the heap to be of length 2^*m* for some integer *m*.

A node of size 2^*k* with starting address *a* (relative to the start of the heap) has as its buddy the node of size 2^*k* whose starting address differs from *a* in bit *k* alone: the buddy starts at *a* + 2^*k* if *a* is a multiple of 2^(*k*+1), and at *a* − 2^*k* otherwise.

With the buddy system, lists are never actually traversed to find storage to allocate, although time is spent finding the proper list from which to take the storage. The result is quick response to requests and returns. However, in contrast to two-way lists, this scheme produces significant *internal* fragmentation. *Internal fragmentation* occurs when allocated nodes are larger than requested, resulting in wasted storage residing within a node. This is one of the drawbacks of the buddy system. Studies have been done on similar schemes that allow a greater variety of node sizes to appear. They also tend to be fast and to reduce storage loss due to internal fragmentation.

At this point each of the two approaches, circular lists and buddy systems, seems to solve half the problem. Circular lists using the best-fit method utilize storage very efficiently, causing little internal fragmentation. Buddy systems produce significant internal fragmentation but offer fast allocation by narrowly focusing the search for a node of the proper size. The question is whether it is possible to achieve both good storage utilization and fast allocation by combining their features. As you would guess, it is, by using the appropriate structures. The means to do it is called the *indexing* approach. The indexing approach breaks up the list of available space into a collection of two-way circular lists. Each of these circular lists contains nodes of just one size. Each node on a circular list represents a block of consecutive memory locations available for the storage of a record of the size of the node. Assuming that requests for nodes of certain sizes (say, 1 to 4) are very frequent, an array a contains pointers to each of the four lists corresponding to these sizes. The last array entry contains a pointer to an AVL tree. The nodes of this tree are circular lists, each list corresponding to nodes of a specific size that is requested much less frequently. Thus a[2] contains a pointer to a circular list containing four blocks, or nodes, each of size 2. In the AVL tree pointed to by a[5], the root node is a circular list containing three blocks of size 10. The tree is ordered on the basis of node sizes. A typical configuration might be as shown in Figure 13.5.

The three techniques considered in this section, and other techniques, are compared extensively in Knuth [1973a] and Standish [1980]. See also Tremblay and Sorenson [1984].

When components of data structures are identical and require significant amounts of storage, it is only natural to attempt to avoid component replication. This means that identical components will all share the same storage, so references to each will always be to the same place. As an example, consider Figure 13.6. The two list-structures ls1 and ls2 share a common sublist. As a result, p1, p3, and p4 point to the shared sublist.

Saying that a node or component of a data structure is to be reclaimed for use means that the storage used for its records is to become known to the avail function. That is, the storage no longer needed by its records is to be inserted on the list of available space. There are two basic methods of reclaiming released nodes: *garbage collection* and *reference counters*.

The *garbage collection* method of reclamation is used when the available space list is empty or when the amount of storage that is known to avail becomes less than some given amount. Until that time, whenever a sublist is to be deleted from a data structure, the deletion is accomplished by updating appropriate pointers, but the storage vacated by the sublist's deletion is not made known. Initially, all records are assumed to be unmarked. The garbage collection routine works by traversing each data structure that is in use and "marking" each one of its records as being in use.

Apply garbage collection to the heap of Example 13.4.

The *reference counter method* assumes that the first record of every sublist in use has information giving the number of pointers that point to it. This number is called the *reference count*.

The marking algorithms known are quite elegant. The traversal of the dynamic data structures cannot simply use a stack, since no nodes are known in which to implement the stack. Instead, these algorithms allow the traversals to be accomplished by using tag fields, usually one bit, associated with each node. It is even possible to do the traversals with no tag fields, with a fixed amount of additional storage required. Of course, if we had an infinite amount of storage available, no reclamation would be necessary. Storage management would be trivial! The next section features three traversal algorithms for use in phase I of the garbage collector, or in other applications where storage is at a premium. See Pratt [1975] for more on storage management. The algorithms to be discussed next and generalizations are also given in Standish [1980].

Garbage collection routines require the traversal of dynamic data structures that are more general than binary trees, such as list-structures and more complex lists. Algorithms for their traversal are generalizations of those we consider. For increased clarity, we will restrict our discussion to binary tree traversals and present three techniques that do not use extra storage for a stack.

These algorithms could be written assuming a particular binary tree implementation, such as arrays or pointer variables. Instead, to emphasize data abstraction and functional modularity, the following function, which is independent of the tree implementation, has been chosen. The function is a preorder traversal that saves the path from the root on a stack.

```c
preorder(pt)
/* Preorder traverses the binary tree. */
binarytreepointer *pt;
{
    binarytreepointer null, p, lptr, rptr;
    binarytreepointer setnull(), left(), right(), nextsubtree();
    stack s;

    null = setnull();
    p = *pt;
    setstack(&s);
    while (p != null)
    {
        process(pt, p);
        lptr = left(p);
        if (lptr != null)
        {
            push(p, &s);
            p = lptr;
        }
        else
        {
            rptr = right(p);
            if (rptr != null)
            {
                push(p, &s);
                p = rptr;
            }
            else
                p = nextsubtree(&s, &p);
        }
    }
}
```

For each traversal, the basic concept is described first, followed by the appropriate implementation of the routines used by the procedure.

A *threaded tree* implementation of the preorder traversal procedure is shown in Figure 13.11. It results from replacing the null left and right pointer field values of each node of t by a pointer to that node's predecessor and successor, respectively, in an inorder traversal of t. These pointers are called *threads*.

Routine              Task
--------------------------------------------------------------------
setnull()            determined by the implementation (array or
                     pointer variables)
setstack(&s)         sets s to null
push(p,&s)           sets s to p
left(p)              returns a copy of leftptr.p if it is not a
                     thread, and a null value otherwise
right(p)             similar to left(p)
nextsubtree(&s,&p)   set s to p
                     while rightptr.s is a thread,
                         set s to rightptr.s
                     return a copy of rightptr.s

Now consider the tree of Figure 13.12. It represents the binary tree t of Figure 13.10 during a *link-inversion traversal* after node.p has been processed. Notice that a path exists in the tree from predp back to the root. Each node has a tag field associated with it, and all tag values are initially zero. Predp always points to the predecessor of p in t, and every node on the path from predp to the root has its right pointer field pointing to its predecessor if its tag field is 1, and its left pointer field pointing to its predecessor if its tag field is 0. The root of the subtree whose left subtree traversal is complete after the processing of p may be found by following the backward links from predp until a tag value of 0 is encountered. As the path back is followed, a tag of 1 at a predecessor node means we are coming from the node's right subtree; a tag of 0 means we are coming from the left subtree. Again, for node.p this leads to node.p4.

Routine              Task
-----------------------------------------------------------------------------
setnull()            determined by the implementation
setstack(&s)         sets s to null
push(p,&s)           if leftptr.p is not null, then
                         set leftptr.p to s and s to p
                     else
                         set tag(p) to 1, set rightptr.p to s and s to p
left(p)              returns a copy of leftptr.p
right(p)             returns a copy of rightptr.p
nextsubtree(&s,&p)   set found to false
                     while (not found) and (s not null)
                         if tag(s) = 1, then
                             set pred to s,
                             set tag(s) to 0, s to rightptr.s,
                             rightptr.pred to p, and p to pred
                         else
                             set pred to leftptr.s and leftptr.s to p
                             if rightptr.s is not null, then
                                 set found to true, tag(s) to 1, return
                                 rightptr.s, and set rightptr.s to pred
                             else
                                 set p to s and s to pred
                     if s = null, then
                         return null

In this implementation s plays the role of predp.

It is remarkable that an algorithm has been found which requires neither a stack nor tag fields. This is the *Robson traversal*, the final stackless traversal we consider. Again, we give a preorder version, although it may be used for inorder and postorder traversals as well.

The Robson traversal does not have tag fields available for this purpose. Instead, a stack is kept in the tree itself, which allows node.q to be found: q is the root of the *most recent* subtree whose nonnull left subtree has already been traversed and whose nonnull right subtree is in the process of being traversed. A variable top is used, and kept updated, to point to the current node.q. A variable stack is used, and kept updated, to point to the *next most recent* subtree whose nonnull left subtree has already been traversed and whose nonnull right subtree is in the process of being traversed. Variables p and predp are used, as in link inversion, to point to the node currently being processed and to its predecessor, respectively. Each stack entry, except the current top entry, is kept in the "rightmost" terminal node of one of the subtrees whose nonnull left subtree has been completely traversed and whose nonnull right subtree is in the process of being traversed. In Figure 13.14, when node.p is being processed, there are four subtrees whose left subtrees have already been traversed and whose right subtrees are currently being traversed. Their roots are labeled 1 to 4 in Figure 13.14. Nodes 1, 3, and 4 have nonnull left subtrees. Each of these nodes has its left pointer field pointing to its predecessor and its right pointer pointing to the root of its left subtree. Node 2, which has a null left subtree, has its left pointer retaining its original null value and its right pointer field pointing to its predecessor. Nodes 1, 3, and 4 are thus the roots that must have stack entries. The variable top points to the most recent, node 1. The variable stack points to the stack entry corresponding to the next most recent root, node 3. This stack entry is kept in the rightmost terminal node of node 1's left subtree. This rightmost node has its left pointer field pointing to the node that contains the next stack entry, and its right pointer field pointing to its corresponding root (node 3 in this case).
All stack entries (except the first, pointed to by top) have this same format. In Figure 13.14, the last stack entry points to node 4.

If the node's left subtree is being traversed, then its left pointer points to its predecessor and its right pointer is undisturbed.

If the node's right subtree is being traversed, then

    if the node has a null left subtree, its right pointer points to its predecessor;

    otherwise, its left pointer points to its predecessor, and its right pointer points to the root of the node's left subtree.

Routine              Task
-----------------------------------------------------------------------------
setnull()            determined by the implementation
setstack(&s)         sets top, stack, and s to null
push(p,&s)           if leftptr.p is not null, then
                         set leftptr.p to s and s to p
                     else
                         set rightptr.p to s and s to p
left(p)              returns a copy of leftptr.p
right(p)             returns a copy of rightptr.p
nextsubtree(&s,&p)   set avail to p
                     set found to false
                     while (not found) and (s not null)
                         if top = s, then
                             save stack in hold, pop the stack by setting
                             top to rightptr.stack, stack to leftptr.stack,
                             and leftptr.hold and rightptr.hold to null.
                             Save leftptr.s in pred, restore leftptr.s by
                             setting it to rightptr.s, restore rightptr.s
                             by setting it to p, set p to s, and s to pred
                         else
                             if leftptr.s = null, then
                                 save rightptr.s in pred, restore
                                 rightptr.s by setting it to p, set p to s,
                                 and s to pred
                             else
                                 if rightptr.s is not null, then
                                     push an entry for s onto the stack by
                                     setting leftptr.avail to stack,
                                     rightptr.avail to top, stack to avail,
                                     and top to s. Return the value of
                                     rightptr.s, set rightptr.s to p, and
                                     set found to true
                                 else
                                     save leftptr.s in pred, restore
                                     leftptr.s by setting it to p, set p
                                     to s, and s to pred
                     if s = null, then
                         return null

This chapter has shown how a list of available space can be maintained to keep track of storage known to be currently unused. Storage may be allocated dynamically, during program execution, by deleting appropriate elements from this list when needed.

The need and means for storage reclamation have also been discussed. Garbage collection and reference counter techniques (or some combination) may be used for this purpose. When a node of a binary tree, a list, or another dynamic data structure is no longer needed, the programmer frequently knows at what point this occurs. Some high-level languages provide commands with which the programmer can signal this occurrence. For example, C provides the function free. Its parameter is a pointer variable, such as one returned by malloc or another allocation function. When p points to a node that the programmer knows is no longer needed, free(p) may be invoked. This signals that the storage pointed to by p may be relinquished and reclaimed for use by another dynamically allocated data structure. Exactly how free uses this information is determined by the implementation.

Consider the following disaster segment as an illustration of these dangers.

(2) r = malloc(sizeof(binarytreerecord));

**1. a. **Consider the Suggested Assignments of Chapter 11. Suppose only 200 memory elements are available for lists to store successors. Give an example of a problem with fewer than twenty objects for which the implementation of Assignment 1 will work but the implementation of Assignment 2 will fail. Assume count and succ have length 20, the lists array of (1) has length 200, and the lists array of (2) is 20 × 10.

**b. **Can you find such an example for which Assignment 2 works but Assignment 1 fails?

**2. a.** Consider two implementations. First, a total of 50 memory elements in an array are to be dynamically allocated and reclaimed so that a binary tree (in linked representation) as well as a stack may be stored in the array. Second, an array of length 35 is to hold the binary tree, and another array of length 15 is to implement the stack. Each array is dynamically managed. The binary tree is to be preorder traversed using the stack. Give an example of a binary tree, with fewer than 35 nodes, that may be successfully traversed with the first implementation but not the second.

**3. **In Exercise 1, the lists array of (1) functioned as a heap. Why was its storage management so simple?

**4. **In Exercise 2, the arrays functioned as heaps. Why was the first implementation able to do traversals that the second could not do?

**5. **In the first implementation of Exercise 2(b), suppose that, after a node is accessed and processed, it is simply put onto the available space list.

**a. **Is the head of the tree a dangling reference?

**b. **Are fixed-size or variable-size nodes being dealt with?

**6. **This is the same as Exercise 5, except that, after a node is accessed and processed, it is not put onto the available space list. After the binary tree has been traversed, the head of the tree is set to null.

**7. **Why is compaction not necessary with fixed-size nodes?

**8. **Suppose Example 13.3 deals with three binary trees. Each node has an integer valued info, leftptr, and rightptr field. The stack needs elements that are also integer valued. An integer array of length l is used as a heap in which nodes of the binary trees and the stack are stored. Assume one additional field is used for each node to indicate its length. Design a storage allocation scheme and write an appropriate avail function.

**9. a.** Suppose in Exercise 8 that when a node of a binary tree is deleted, its predecessor node will be changed to point to the left successor of the deleted node. Write a function delete that carries out this deletion and also returns the storage used by the deleted node to a list of available space.

**b.** How would the delete function change if reference counters were used as a reclamation scheme?

**10. **Write avail for the two-way circular list implementation for

**a. **Randomly ordered list entries.

**b. **List entries in order by size

**c. **List entries in order by address

**11. **Write avail for the buddy system implementation.

**12. **Write a reclaim function for the buddy system.

**13. **Why will garbage collection take a longer time when many nodes are in use and a shorter time when few nodes are in use?

**14. **What are the disadvantages of reference counters compared to garbage collection?

**15. **Write a function to traverse a list-structure with shared sublists to reclaim storage of a sublist whose reference count has gone to zero.

**16. **Write a function to insert a node in a threaded binary tree t. The node is to be inserted as the node accessed next in an inorder traversal of t after node.pred is accessed.

**17. **Write a function to delete a node in a threaded binary tree t. The deleted node is to be node.p.

**18. **Write a function to return a pointer to the predecessor of node.p in a threaded binary tree t.

**19. **Write a function to read information into the info fields of a binary tree t, implemented using pointer variables. The info fields will contain character strings of up to twenty characters. The input is given in the order that the info fields would be accessed in a preorder traversal of t.

**20. **How will your solution to Exercise 19 change if the binary tree t is implemented as a threaded tree?

**21. **Implement the transformation of Section 7.6.1 using a list implementation for the stack. The binary tree is implemented using pointer variables as a threaded tree. Assume the general tree is given by a two-dimensional array of size 50 × 50. If the [i,j] entry of the array is 1, then node j is a successor of node i; if the [i,j] entry is 0, then node j is not a successor of node i.

**22. **This is the same as Exercise 21 except that the general tree will be given by a list of successors for each node. These lists should be pointed to by pointer variables, one for each node of the general tree. The binary tree should be a link-inversion tree.

**23. **Suppose input, representing a general tree, is given as a sequence of pairs. A pair i, j implies that node j is a successor of node i. A sentinel pair has its i, j values equal. For example, the general tree of Figure 7.13 might be input as B,F A,C D,J I,K I,L C,H B,E I,M A,D D,I A,B B,G X,X. Write a function that will create the successor lists of Exercise 22.

**24. **Simulate the preorder traversal implementation for the link-inversion traversal of the binary tree of Figure 13.10.

**25. **Simulate the preorder traversal implementation for the Robson traversal of the binary tree of Figure 13.10.

**26. **Why will there always be enough storage in the binary tree, during a Robson traversal, for the stack of root nodes whose left subtrees are nonnull, whose traversal has been completed, and whose right subtrees are being traversed?

**27. **What purpose does the stack of Exercise 26 serve in the Robson traversal?

**28. **Write a function to do an inorder traversal of a threaded binary tree t.

**29. **Write a function to do a link-inversion traversal of a binary tree t assuming the tree is implemented using pointer variables.

**30. **Write a function to do a Robson traversal of a binary tree t assuming the tree is implemented using pointer variables.

**1**. Write and run a function to copy a list-structure with shared sublists. Use pointer variables.

**3**. Why would we not write recursive versions of the three stackless traversals?