New Features P3: Memory Management Scopes

Intro

It looks like the next build of 4coder (4.0.29) is going to be ready sometime in the next few weeks. The new build has been in development for a couple months and is loaded to the brim with new features that have all gone through interesting architectural and algorithmic design that I believe are worth sharing for several reasons. One it will prepare 4coder users who want to start writing customizations for how to think about the new features. Two it will help anyone who likes to think about the processes of architecture and algorithm design with some examples of my own process. Three it might expose me to some criticisms or suggestions that could help me improve the specifics of the new 4coder features before I put them into the wild.

Directory of all parts:

Memory Management Overview
Memory Management Variables and Objects
Memory Management Scopes
Custom UIs and Various Layers for Lister Wrappers
Custom Cursors, Markers, and Highlights, and the Render Caller

Scopes Basics

If you're not caught up with part 1 and part 2, go back and get caught up, because this part will make no sense without that context. Scopes are all about two things, managing lifetimes and keying values you want to store. Both of these terms deserves more elaboration. By "managing lifetimes" I mean a few things. There is the fact that the memory put on the scope is freed when the scope closes, thus preventing memory leaks, but there's a bit more to it too. Since scopes contain variables, we can close and reopen a scope to free it's contents and set all it's variables back to default. In other words managing lifetimes is about doing bulk actions that clear a state back to default.

By "keying values" I am talking about the method you use to store and retrieve a value. One big difference between an object and a variable in this system is that the two are keyed very differently. Variables are keyed by a scope and a compile time constant, while objects are keyed by runtime generated handles. These differences have all sorts of implications. If you have an function that does several variable operations inside a scope, that function only needs to be parameterized on the scope, whereas the same operation on several objects would have to be parameterized on each object handle. Of course we can construct all sorts of other scenarios where we would prefer not to have to have a scope and variable name and would rather just manipulate everything through a few runtime handles. The point of this is whenever we put something into a scope it's not just about tieing the lifetime of the allocation to the scope so that it will free at the appropriate time, but we're also setting ourselves up to be able to find that information when, and only when, we want to be able to get it back.

The core provides scopes tied to the lifetime of buffers, scopes tied to the lifetime of views.

1 2	Managed_Scope buffer_get_managed_scope(Application_Links app, Buffer_ID buffer_id); Managed_Scope view_get_managed_scope(Application_Links app, View_ID view_id);

User Scopes

Tieing scopes to buffers and views does a lot of work, but once I started getting used to scopes I realized that module authors could have all sorts of reasons to create scopes themselves so that their own concepts with lifetimes could also have all the advantages of the scoping system. For instance suppose you build a little debugger integration into 4coder and you have a command that explicitly begins a debug sessions and another that explicitly ends the session. Those would be perfect times to start and end a managed scope, so that all the data you create and store along the way that is specifically relevant to the debugging session, but not relevant without it, can be on that scope. For this purpose I introduce a pair of calls for creating and destroying user managed scopes.

1 2	Managed_Scope create_user_managed_scope(Application_Links app); bool32 destroy_user_managed_scope(Application_Links app, Managed_Scope scope);

Note that introducing this destroy call has the possibility to confuse users who think they can or should destroy scopes from the core like the view and buffer scopes. However I didn't want to introduce a different type for user scopes because, besides this one quirk, everything else a scope can do is universal between core scopes and user scopes: variable set, variable get, object allocate, and all of the scope operations to come in this post.

Bulk Clear - Heaps in Heaps

I've used the phrase "bulk" several times now, and this is the first part of the system that is algorithmically interesting. When I say bulk, I am not just saying that a whole bunch of individual frees are done automatically. That would be "bulk" in the sense that one call triggers a bunch of work for convenience, but I actually mean "bulk" in the sense that one call achieves a bunch of work by a more efficient method than you could achieve without going bulk. First, let's consider a scope as basically having two main allocation systems, the allocation for variable storage and the allocation for object storage. Variables are stored in a sparse lookup table that is allocated in a single contiguous block and can be freed in one operation no matter how many variables in the scope were actually modified from their default value.

Bulk freeing allocation for objects is the more interesting case. Recall that memory objects are allocated with a variable size and support a free operation:

1
2

Managed_Object alloc_managed_memory_in_scope(Application_Links *app, Managed_Scope scope, int32_t item_size, int32_t count);
bool32 managed_object_free(Application_Links *app, Managed_Object object);

In 4coder this is all based on a very simple heap style allocator, but the implementation type and optimization details are actually not important to the concept, an allocator that supports the following operations is all we need to alloc, free, and bulk free our objects:

struct Allocator{ /*off topic*/ };
Allocator make_empty_allocator();
void allocator_extend_available_memory(Allocator *allocator, void *mem, int32_t size);
void *alloc(Allocator *allocator, int32_t size);
void free(Allocator *allocator, void *ptr);

The keys here are that we have an Allocator handle in each operation, so that we can build multiple allocators and easily parameterize allocations on the allocator, and that we can extend an allocator an arbitrary amount while knowning that it does not automatically extend itself. Then what we can do is have a source of memory at the top of the system which can be an Allocator or anything else, and then we can build a separate allocator for each scope. Whenever we allocate an object on the scope, we go through the allocator assigned to that scope, so all of the alocations belong to a set of larger contiguous ranges, and if we fail to get the memory we need we can go to the larger allocation system at the top of the system and get some more memory to extend the scope's allocator. Finally when the time comes to free all objects we can iterate each page that we got from the top and free it back to the top, we don't have to iterate all the little objects and free each one individually. If we do a good job of putting lots of objects on each page we have greatly reduced the amount of freeing work we have to do on a bulk free operation.

With this design I have found that it almost never takes more than two frees to the top to destroy any scope (often less than two because scopes don't allocate anything for variables or objects before they have to).

Bulk clears and resets are so useful that sometimes I think we will want to do them without destroying the scope, and sometimes we will want to do a bulk clear on a core scope that we can't destroy, and the API supports this:

1	bool32 clear_managed_scope(Application_Links *app, Managed_Scope scope);

Scopes with Multiple Dependencies (Turning the wacky and wild dial to eleven)

It turns out that scopes as described so far, while they solve a lot of useful problems, also feel like they fall just short of fully solving the problem in a lot of cases. Let's look at the case of building the sticky jumps system in 4coder. Sticky jumps is the name for the system that powers jumping to error and jumping to items in a search list. They are "sticky" because once the jumps are parsed, the positions are marked with markers which will follow that text around as the buffer is edited. This means that jump lines and positions remain correct even as the text about the position change. We will look at how markers operate in more detail in part 5, but for now the important point is that markers are only relevant while the buffer they mark still exist and while the buffer that contains the jump lines still exists, because in 4coder the way you trigger a jump needs the jump buffer even after the markers for sticky jumps have been placed. So what do we do?

If we place the markers in the scope of the destination buffer, they will be freed automatically when that buffer is closed, but if the jump buffer is closed it has to manually free all the markers it created, and by the way jump buffers are closed and reopend all the time, for instance every time you rebuild the compilation buffer, which can have jumps, is closed and reopened. If we place the markers in the scope of the jump list buffer, we just flip the problem around without solving it, now the jump lists which are frequently closed get the bulk automatic free, but all other buffers in the entire system now have to take themselves out of all the lists out there whenever they close. We can't just leave markers in the scope of the jump list and hope the jump list gets closed and reopened soon, because as long as that info sits there, someone might try to jump to it even though the system should know that that's impossible now, it's not just a leak to leave something around it's a bug!

// Bad scenario one: store markers in the scope of desitnation buffer
BUFFER_CLOSE_HOOK(sticky_jump_cleanup_hook){
    Jump_List *list = lookup_jump_list(buffer_id);
    if (list != 0){
        for (int i = 0; i < list->count; i += 1){
            managed_object_free(app, list->marker_handle[i]);
        }
        free_jump_list(list);
    }
    // By the way, when we close buffers that have
    // markers attached, do we just leave the stale
    // handle in the jump list, or... ???
}
// Bad scenario two: store markers in the scope of the jump buffer
BUFFER_CLOSE_HOOK(sticky_jump_cleanup_hook){
    List_Of_Lists_That_Refer_To_A_Buffer *list_of_lists = lookup_list_of_lists(buffer_id);
    if (list_of_lists != 0){
        for (int i = 0; i < list->count; i += 1){
            Jump_List *list = lookup_jump_list(list_of_lists->lists[i].list_buffer_id);
            if (list != 0){
                int index = list_of_lists->lists[i].my_sub_index;
                managed_object_free(app, list->marker_handle[index]);
                list->marker_handle[index] = 0;
            }
        }
        free_list_of_lists(list_of_lists);
    }
}

I hope you can feel where this is going. We want a scope where we can place information and we want that scope to be dependent on two different buffers. When I started to feel this problem I decided to take some time to do some architectural investigation and figure out what it was I was really wanting. In this case all I knew about the API was I wanted a new way to get a Managed_Scope I couldn't confidently say more than that because I wasn't sure what was feasible, so I started thinking about the rest of the problem from the implementation end. Should I support some kind of "pairs of scopes" system? Or should I just go full arbitrary sets of dependencies? Should I create a user scope and set the dependency, or should the core own the scopes with multiple dependencies? I started by asking if the best case scenario was feasible: Arbitrary sets of dependencies, automatically managed by the core.

You might question whether "automatically managed by the core" is best case scenario, because it appears to remove some felxibility from the system. Maybe a user wants to create and close the scope manually and also set it to automatically close in other conditions! But I thought "core managed" was better because I wanted to rely on the scope's capacity for keying. For example, if I have a pair of buffers that I used to get a scope with multiple dependencies, and then I store a variable into that scope, I want it to be trivial to get that variable again the next time I happen to have the same pair of buffers. If I used user created scopes, then although the scope would have the right life time properties, the pair of buffers would not be a key that retrieves that scope, which is a feature I knew I would need.

CREDITS: Andrew Chronister, the Chronal Dragon, deseves the credit for coming up with the heart of this system. I had already conceived of the problem and endeavored to evaluate the best possible solution, and before I discussed it with Andrew, my best solution wasn't very good. He provided the key insight that I needed to support this feature. In the process we used to reach the solution we rephrased the problem into simpler components that capture the actual heart of the problem. In this case we have "objects" which are things like buffers and views, which we will name with capital letters A, B, C and "keys" where a key is a set of objects. All possible sets of objects are valid keys.

The connection back to scopes is that each possible key will have a scope associated with, and the objects are the things on which scopes are dependent. When you get the scope for a view or buffer, and when you create a user scope, you are getting a "basic scope" which is a scope associated to a "basic key" which is a key with one dependency.

Examples of basic keys:

1 = {A}
2 = {B}
3 = {C}
4 = {D}

A scope with multiple dependencies is a scope associated with a key with multiple elements. In other words keys like the following:

5 = {A, B}
6 = {A, C}
7 = {A, D}
8 = {A, B, C}

When we want a scope dependent on two different buffers we will get the scopes for each individual buffer, then pass the scopes into a call that builds the union of their keys and uses the new key to lookup and return the scope with multiple dependencies.

The tricky part of this is making sure we have enough information to do all the necessary cleanup work when we destroy an object. Obviously keys are constructed as a set of objects, so one option is that when we destroy object A, we iterate all keys and destroy any key that contains A, removing it from whatever accelerated lookup structures we use. But I didn't like this because I was expecting there to be a lot of keys. After all there is automatically one for each buffer and view, then for any union of basic keys that has been used at least once, those keys exist too. The key insight that I needed to put this whole thing together were following the steps for creating keys and destroying objects, described below in psuedo code:

struct Key{
    struct Object **objects;
    int object_count;
};
struct Object{
    // NOTICE: the object stores the list
    // of keys that currently contain it
    struct Key **keys;
    int key_count;
    int key_max;
};

Object* create_object(){
    return(AllocAndClearToZero(Object))
}

Key* get_key(Object **objects, int object_count){
    SortPointers(objects, object_count);
    object_count = RemoveNeighborDuplicates(objects, object_count);
    Key *key = LookupKeyInLookupStructures(key);
    if (key == 0){
        key = AllocAndClearToZero(Key);
        key->objects =
            AllocArrayAndCopy(Object*, object_count, objects);
        key->object_count = object_count;
        // NOTICE: when we create a key we iterate the
        // member objects and insert ourself into the
        // object's key list
        for (int i = 0; i < object_count; i += 1){
            StretchBufPush(objects[i]->keys,
                           objects[i]->key_count,
                           objects[i]->key_max, key);
        }
        InitObjectsAttachedToKey(key);
        InsertKeyIntoLookupStructures(key);
    }
    return(key);
}

void internal__destroy_key(Key *key, Object *cause_object){
    for (int i = 0; i < key->object_count; i += 1){
        // no point in doing anything to the cause object
        // it is about to free itself
        Object *object = key->objects[i];
        if (cause_object == object) continue;
        // we remove key from this object
        // TODO: accelerate to avoid linear search?
        for (int j = 0; j < object->key_count; j += 1){
            if (object->keys[j] == key){
                object->key_count -= 1;
                Swap(Key*, object->keys[j],
                     object->keys[object->key_count]);
                break;
            }
        }
    }
    CleanupAnythingAttachedToKey(key);
    RemoveKeyFromLookupStructures(key);
    Free(key->objects);
    Free(key);
}

void destroy_object(Object *object){
    // NOTICE: we have the set of all keys that
    // contain object and we iterate it to destroy
    // all dependent keys
    for (int i = 0; i < object->key_count; i += 1){
        internal__destroy_key(object->keys[i], object);
    }
    Free(object->keys);
    Free(object);
}

The parts not in CamelCase constitute the heart of the system. We have objects that we create and destroy, and keys that we create under the hood whenever we first try to get them, and then are destroyed when any of their objects are destroyed. Creating a key puts it into all of the appropriate key lists for the appropriate objects. Destroying an object destroys all of it's keys, and destroying a key causes it to remove itself from all of it's objects. I usually don't like letting something get too complicated, and this definitely qualifies as complicated, but now we have the task of comparing it to the alternative.

The alternative is that everytime someone wants a scope with multiple dependencies they start going through the process of building lookup tables so that when an object gets destroyed they can find all of the data structures they need to cleanup. If you scroll back up to Bad scenario one you will see the comment:

1
2
3

// By the way, when we close buffers that have
// markers attached, do we just leave the stale
// handle in the jump list, or... ???

And if you look at Bad scenario two you will see the List_Of_Lists_That_Refer_To_A_Buffer. I get the distinct impression from all of this that this pattern of self refential business, all this "keeping a list of places where there are pointers to me so I can remove them" stuff is just the general nature of this problem. Whether or not this is the simplest or best solution I don't know. What I do know for sure was that going case by case wasn't actually making it easier, the extra context of a specific case wasn't enough to simplify the problem. Maybe someday we'll find a simpler solution for this class of problem, but for now, I think we're all very likely to be rewriting the same pointer webs everywhere. I thought I might as well move all the pointer webing into the core, do it once, and get it right so that we never have to do it again for this class of problem.

All of that thinking and considering boils down into one little call:

1	Managed_Scope get_managed_scope_with_multiple_dependencies(Application_Links app, Managed_Scope scopes, int32_t count);

This call really does a union operation on the scopes you pass in. If you pass the same scope twice, you'll get that scope back:

1 = {A}
1 union 1 = {A} union {A} = {A} = 1

If you pass a scope that was already the result of a union, it will be unioned again:

5 = {A B}
6 = {A C}
8 = {A B C}
5 union 6 = {A B} unkion {A C} = {A B C} = 8

Looking at the scopes in this way, it occurred to me that I was missing the scope for the empty set. Such a scope would represent a "global" scope. It has no member objects, so it requires no key to acquire it; when you're thinking about keying "requires no key to acquire" means global. This scope also had the unique property that nothing would ever be able to destroy it, since it is non-basic it has to be destroyed by destroying an object, but no object ever can destroy it. So, quite appropriately, the global scope is a scope that lasts forever (well until you close 4coder, as if that ever happens :D). You could acquire the global scope with get_managed_scope_with_multiple_dependencies by passing in no scopes, but go make life easier the API provides the call:

1	Managed_Scope get_global_managed_scope(Application_Links *app);

This scope, being associated with the empty key, has the unique property that it has no effect when it's key is unioned to other keys:

0 = {}
4 = {D}
0 union 4 = {} union {D} = {D} = 4

This might seem confusing, but it is actually quite appropriate. The global scope always exists, so it should be meaningless to say that something depends on the lifetime of the global scope.

Finally, just as it is useful to do bulk clears of single scopes, it also strikes me as occasionally useful treat a basic scope as an identifier for the single object it contains, and clear all the scopes that depend on the same object as the basic scope.

1	bool32 clear_managed_scope_and_all_dependent_scopes(Application_Links *app, Managed_Scope scope);

A Scope with Multiple Dependencies Example (Adjusting to the new whacky world we just created)

Okay that was a lot, let's see how this thing looks in action. Let's return our thoughts to the sticky jumps problem, and solve it with this new power. We can now put the markers into a scope that is dependent on both buffers. We still need a single list of all the object handles spread out across all those scopes, so that we can find what we're looking for from just the jump buffer. To this end we will create an array that stores information per-jump-line. The information will include the id of the destination buffer, a subindex that tells us that this is the Nth jump from this list that goes into this buffer, and the line number of the jump line in the jump buffer.

Triggering a jump looks something like this:

void jump_to_line_at_cursor_nearest(Application_Links *app){
    View_Summary view = get_active_view(app, AccessAll);
    Managed_Scope jump_scope = buffer_get_managed_scope(app, view.buffer_id);
    Managed_Variable_ID jump_array_varid = managed_variable_create_or_get_id(app, "STICKY_JUMPS.jump_array", 0);
    uint64_t value = 0;
    if (!managed_variable_get(app, jump_scope, jump_array_varid, &value)){
        return;
    }
    if (value == 0){
        return;
    }
    Managed_Object jump_array = (Managed_Object)value;
    int count = managed_object_get_item_count(app, jump_array);
    Jump_Data *jumps = AllocArrayAndClearToZero(Jump_Data, count);
    managed_object_load_data(app, jump_array, 0, count, jumps);
    int best_index = find_best_jump(jumps, count, view.cursor.line);
    Jump jump = jumps[best_index];
    Managed_Scope dest_scope = buffer_get_managed_scope(app, jump.dest_buffer_id);
    if (dest_scope == 0){
        return;
    }
    Managed_Scope scope_array[2];
    scope_array[0] = jump_scope;
    scope_array[1] = dest_scope;
    Managed_Scope marker_scope = get_managed_scope_with_multiple_dependencies(app, scope_array, 2);
    Managed_Variable_ID marker_array_varid = managed_variable_create_or_get_id(app, "STICKY_JUMPS.marker_array", 0);
    if (!managed_variable_get(app, marker_scope, marker_array_varid, &value)){
        return;
    }
    if (value == 0){
        return;
    }
    Managed_Object marker_array = (Managed_Object)value;
    Marker marker = {0};
    if (!managed_object_load_data(app, marker_array, 0, 1, &marker)){
        return;
    }
    DoJump(app, jump.destination_buffer_id, marker.pos);
    Free(jumps);
}

If that looks like we've had to do more work here than we would have done without managed variables and objects and scopes everywhere, you're absolutely right. This code would be half as long if all the memory was allocated and managed directly by the custom side. However by doing the extra work to pass all memory management through this system, we are now completely free from having to write any freeing code for markers and lists, and while this is a lot of code, I find it much easier to get right than the "list of lists with pointers to me" type code.

We still leave stale handles behind, but now instead of those stale handles being pointers that could refer to memory that has been given a new purpose or taken out of the valid address space, now stale handles are gracefully handled by the API which checks and indicates that the handle has gone stale allowing us to prevent a crash and possibly even take appropriate actions in the absense of a viable jump destination.

Thanks for reading through on this whirlwind of a post! Next something much lighter: the entire UI system and series of layers of wrappers for listers, see you then!

New Features P3: Memory Management Scopes

Comments