Coursera part 2 introduces a slight variation of the Heap Management algorithm presented in the book.
The significant changes are
- The size field in the segment headers has changed to be the size of the segment's data block instead of the size of the segment itself.
- Segments are always split into two segments when an allocated segment is returned from Memory.alloc().
- The returned block contains a complete header. (In the book's algorithm, allocated segments have size as a single-word header.)
Also note the trivial change that the order of the
link and
size fields has been swapped.
| | Before alloc(5) |
|
freeList | → |
next | → |
next | → |
null |
| |
2 | |
8 | |
3 |
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
|
|
| | After alloc(5) |
|
freeList | → |
next | → |
next | → |
null |
| |
2 | |
1 | |
3 |
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
null | * |
|
| |
| |
5 | |
|
returned block | → |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
* See Allocated block header section, below. |
|
Figure 1: New Heap Management algorithm |
Memory.alloc() changes
The search loop in Memory.alloc() remains the same—using either first-fit or best-fit, find a segment on the free list whose size is greater than or equal to the requested block size.
In the Book algorithm, once a large enough segment is found, the segment is either used in its entirety, or it is split into two segments. If the entire segment is used, the segment must be removed from the free list. This is a non-trivial operation because the search loop must keep track of two pointers during the linked-list search so that the segment that links to the segment to be removed can have its next pointer updated.
The Coursera algorithm vastly simplifies this by always spliting the found segment and using the newly created segment for the allocated block. The found segment is reduced in size but is never removed from the free list. Another useful effect of always splitting segments for allocation is that the free list will never be empty; the freeList pointer will never be null.
Memory.deAlloc() changes
Memory.deAlloc() is simplified because the
size field is already in the correct location and has the correct value. The deallocated segment just needs to be returned to the free list.
Allocated block header
The Coursera algorithm sets the
next pointer in the block header to
null. An improvement would be to set it to a
signature, a known value that is neither
null nor a legal pointer value. This will allow dealloc() to check if the block being deallocated is an allocated block. (Double deallocation is a common programming error.)
The simplest signature scheme is to set the next field to an easily recognizable value like -31416.
An alternative is to set the next field to a value derived from the block size. This would help prevent allocated blocks that had their header overwritten from being placed on the free list and corrupting the heap. Setting the block size to not(block size) will form a suitable value that is an illegal address.
Fragmentation
The Coursera algorithm is more susceptible to fragmentation than the Book algorithm because it always splits the heap segment that is used for a block allocation. This means that a block that is deallocated will be too small to satisfy a request for another block of the same size. A program that is repeatedly creating and disposing the same object will inevitably run out of memory.
Defragmentation
Defragmentation is not required by the course, but may be necessary if you want to run your game using your OS. Here is a fairly easy way to do defragmentation in Memory.deAlloc().
- Insert the deallocated segment into the free list in ascending segment address order.
- After insertion, check if either or both the preceding segment or next segment is contiguous with the newly inserted segment; if so, combine the segments.
Another useful property of the Coursera Memory.alloc() algorithm is that because segments are always split and the unallocated portion remains on the free list, the initial segment—the segment at 2048—will always be present on the free list. By maintaining the free list in address order, the first free segment will always be the 2048 segment and there will be no need for a special case to handle insertion at the head of the list.
To find the highest addressed segment that is less than the segment being deallocated, search the list for the first segment that has a next pointer that null or is greater that the deallocated segment's address.
deAlloc(Array o)
segment = o-2 // Get heap segment from data block address
prev_seg = free_list
next_seg = prev_seg.next
while next_seg != null and next_seg < segment
prev_seg = next_seg
next_seg = prev_seg.next
At this point,
prev_seg is the segment before the insertion point and
next_seg is the segment after the insertion point.
WARNING: next_seg may be null.
| | Before deAlloc(block) |
|
| |
2048 | |
3000 | |
4000 |
freeList | → |
next | → |
next | → |
null |
| |
2 | |
1 | |
3 |
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
3003 | |
| |
| |
sig. | * |
|
| |
5 | |
|
block | → |
| |
|
| |
| |
|
| |
| |
|
| |
| |
|
| |
| |
|
* See Allocated block header section, above. |
|
|
| | Insert into free list |
|
| |
2048 | |
3000 | |
3003 | |
4000 |
freeList | → |
next | → |
next | → |
next | → |
null |
| |
2 | |
1 | |
5 | |
3 |
| |
| |
| |
| |
|
| |
| |
| |
| |
|
| |
| |
| |
|
prev_seg→3000 | |
| |
| |
|
next_seg→4000 | |
| |
| |
|
|
|
| | After defragmentation |
|
| |
2048 | |
3000 | |
4000 |
freeList | → |
next | → |
next | → |
null |
| |
2 | |
8 | |
3 |
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
| |
| |
| |
|
|
Figure 2: Deallocation with defragmentation |
It is simple to insert the deallocated segment into the free list between prev_seg and next_seg
prev_seg.next = segment
segment.next = next_seg
There are now three defragmentation cases that need to be handled.
- The deallocated segment immediately follows prev_seg.
- next_seg immediately follows the deallocated segment.
- All three segments are contiguous.
These three cases can be handled with two sequential operations
// Combine segment with next_seg if contiguous.
if (segment + segment.blockSize + 2) == next_seg
segment.blockSize = segment.blockSize + next_seg.blockSize + 2;
segment.next = next_seg.next;
// Combine segment with prev_seg if contiguous.
if (prev_seg + prev_seg.blockSize + 2) == segment
prev_seg.blockSize = prev_seg.blockSize + segment.blockSize + 2;
prev_seg.next = segment.next;
Results
I wrote a test program that allocates 20 blocks randomly sized 200..600 words. It then runs 10,000 passes randomly deallocating 5 of the blocks then reallocating 5 new blocks. Increasing the block size by 25% (to 250..750) causes an allocation failure.
[1]
Block size 400 ± 50%
Heap after 10,000 passes. |
|
Block size 500 ± 50%
Heap after allocation failure in pass 703. |
2048: free (17) -> 2524
2067: alloc 455
2524: free (32) -> 3147
2558: alloc 222
2782: alloc 363
3147: free (71) -> 3465
3220: alloc 243
3465: free (257) -> 4130
3724: alloc 404
4130: free (205) -> 4555
4337: alloc 216
4555: free (60) -> 4870
4617: alloc 251
4870: free (217) -> 5308
5089: alloc 217
5308: free (24) -> 5858
5334: alloc 522
5858: free (81) -> 7122
5941: alloc 317
6260: alloc 445
6707: alloc 413
7122: free (316) -> 8062
7440: alloc 307
7749: alloc 311
8062: free (64) -> 8896
8128: alloc 381
8511: alloc 383
8896: free (457) -> 9790
9355: alloc 433
9790: free (475) -> 10860
10267: alloc 591
10860: free (577) -> 11939
11439: alloc 498
11939: free (981) -> 13394
12922: alloc 470
13394: free (2962) -> 0
16358: alloc 20
16380: alloc 2
|
|
Allocation failure: 742 requested, largest available block 704.
2048: free (43) -> 2960
2093: alloc 515
2610: alloc 348
2960: free (175) -> 4100
3137: alloc 574
3713: alloc 385
4100: free (74) -> 5243
4176: alloc 334
4512: alloc 729
5243: free (539) -> 6396
5784: alloc 610
6396: free (161) -> 9199
6559: alloc 635
7196: alloc 680
7878: alloc 731
8611: alloc 586
9199: free (287) -> 10723
9488: alloc 525
10015: alloc 706
10723: free (65) -> 11408
10790: alloc 616
11408: free (452) -> 13714
11862: alloc 699
12563: alloc 607
13172: alloc 540
13714: free (706) -> 15760
14422: alloc 692
15116: alloc 642
15760: free (596) -> 0
16358: alloc 20
16380: alloc 2
|
[1] If I did my math right, the mean and standard deviation for the sum of 20 uniform random integers is:
Range | | Mean | | Std. Dev. | | +3 S.D. |
[200, 600] | |
8000 | |
517.687 | |
9553.06 |
[250, 750] | |
10000 | |
646.787 | |
11940.36 |
(Note that the failed test is trying to allocate a total of 11896 words which is about +2.9 sigma.)