4coder Blog
Allen Webster
Hello 4coder users!

I am writing an update on the upcoming 4.0.31 because it's been in development for a _long_ time and there's a lot happening in this build now.

Since I opened the GitHub issues page I've collected a lot of feature requests and bug reports, and that alone has been a big source of changes. On top of that there were the four 4coder J4MMs! (tm) that I did with Casey, which led to some big shifts in the core organization and feature set. In addition to all of that, I had my own feature list for this build, with upgrades to the history system, the UI system, and more built-in listers. All of this combined has led to possibly the biggest single 4coder build I've ever made.

OS Changes

As I stated at the beginning of the year, I am only planning on doing one OS at a time now, and there have been so many changes to the OS layer that this is looking like a really good call. File change notifications are getting completely reworked to guarantee correctness and to avoid interfering with some corner-case file system behaviors; keyboard coverage has been widely expanded; and I've added a regular wake-up timer. In addition to all of this, there are major changes in the OS layer by virtue of the changes to my base allocator and string types, which I will discuss more below.

API

There have also been major changes to the custom layer API. I have reconfigured it to work on IDs instead of "Summary"s, and it now explicitly uses the String type for both string input and string output parameters. To make this easier to adapt to, I'm providing a "transition" helper file with this upcoming build that implements the old API on top of the new one as best it can. Everything in the standard custom layer is implemented on the new API now, so all a user has to do to update is disable the transition helper and rewrite any affected routines.
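To make the shape of the change concrete, here is a sketch of the difference in style. The new-style names (view_get_buffer) and the do_something_with_buffer helper are my own stand-ins for illustration, not the final signatures:

// Old style: a "Summary" struct carries copies of state that can go stale.
void old_style(Application_Links *app){
    View_Summary view = get_active_view(app, AccessAll);
    do_something_with_buffer(app, view.buffer_id);
}
// New style (sketch): handles are plain IDs and state is queried on demand.
void new_style(Application_Links *app){
    View_ID view_id = get_active_view(app, AccessAll);
    Buffer_ID buffer_id = view_get_buffer(app, view_id, AccessAll); // assumed name
    do_something_with_buffer(app, buffer_id);
}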

Allocators

That isn't the only big change that will break existing customization layers. I'm also switching off the old Partition linear allocator. This is a helper that was added to 4coder a long time ago, and I've learned a lot since then about how I would rather organize memory allocation. The new system is interesting enough to me to deserve its own entire blog post some day, so other people can avoid the mistakes in my old Partition system. I'm also thinking of moving off my Heap allocation system, although that isn't a sure thing yet. There is no transition helper that exactly replaces Partition, but there are two new types called Cursor and Arena that, between them, can replace Partition almost everywhere, depending on the situation.
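Since Cursor and Arena haven't shipped yet, take this as a rough sketch of how I expect the two to divide the work; every name and signature here is an assumption for illustration only:

// Cursor (sketch): a linear allocator over one fixed block;
// it fails when the block runs out and never grows on its own.
void cursor_sketch(void){
    static char memory[4096];
    Cursor cursor = make_cursor(memory, sizeof(memory));
    int *values = (int*)cursor_push(&cursor, 100*sizeof(int));
    (void)values;
}
// Arena (sketch): chains blocks from a backing source as needed,
// so it can keep growing and is released all at once.
void arena_sketch(void){
    Arena arena = make_arena();
    char *name = (char*)arena_push(&arena, 256);
    (void)name;
    arena_release(&arena);
}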

Strings

There is yet another big change coming down the pipe for the API: a complete reboot of the String type and library. Just like with the allocator, I've had a lot of time since I wrote the original String type in 4coder to develop a string system that I prefer a lot more. The new string types are better at handling string building: the old system required either a fixed-size array or a fixed-size linear allocator (the Partition), and prevented intermixing string building with any other allocations on the allocator; the new system avoids all of these limitations. The new system also supports ASCII, UTF-8, UTF-16, and UTF-32 string types with conversions, and an "Any" string type for encoding-agnostic strings. The encoding feature isn't necessarily a big deal for users in the custom layer, who usually only care about ASCII or UTF-8, but better support for these string types will help a lot in the Windows layer, where UTF-16 <-> UTF-8 conversions are ubiquitous, and in the buffer processing systems, where UTF-8 <-> UTF-32 conversions are common.

Unlike the allocator changes, I believe I will be able to supply a transition helper for strings, but I have to do more work in this direction to know for sure whether that is the case.
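As a taste of the string building improvement, here is a sketch of the list-then-join pattern the new system is aiming at; all of these names (String_List, string_list_push, string_list_join, string_lit, string_from_int) are stand-ins I made up for illustration:

// Sketch: build a string as a list of pieces on an arena, freely
// intermixed with other allocations, and join it at the end.
void build_message_sketch(Arena *arena){
    String_List list = {0};
    string_list_push(arena, &list, string_lit("error at line "));
    int *unrelated = arena_push(arena, 16*sizeof(int)); // other allocations can interleave
    (void)unrelated;
    string_list_push(arena, &list, string_from_int(arena, 42));
    String message = string_list_join(arena, &list);
    (void)message;
}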

Transition Helpers

Throughout the history of 4coder many builds have been API breaking in some small way, but with this build I am essentially trying to create the "final" API that will have to last forever, so I am not holding back from remodeling everything. The downside to this is that there will be a lot of work for users with big customization layers to update to this build. To make this transition easier I am keeping a document that lists everything I break, and I will write and release a guide to updating to the new API along with this build.

As I've mentioned throughout this post I've tried to maintain transition helpers that implement the old APIs on top of the new ones, but where I cannot provide such helpers I am hoping the guide will be enough. After this build comes out I expect I will spend a while just helping everyone with the transition before moving on to the next build.

The End

Alright, I think I've now shared all the most important points. There are a lot of features in this build that I haven't gone over here, but those aren't really the big point; maybe I'll make another post about those too. Thanks for your patience, everyone, while I work on all of this!
Allen Webster
Hey everyone, quick update today.

I've been getting more and more issues, and I don't have the time to deal with them all when they come in. On top of that, I've been trying to manage issue reports both on the forums here and in my email inbox. Moving forward, I will be directing all issues to a new public GitHub issues page:

https://github.com/4coder-editor/4coder/issues

Please go fill it up with all your most important outstanding issues!

-Allen
Allen Webster
It's still right around the start of a new year, making it a great time (psychologically, for me) to organize thoughts and examine progress on 4coder.

Goal
My primary driving goal right now is to get the current version stabilized, build up a few more key features, retire this version, move on to updating the 4coder infrastructure and documentation, and then get on to a new major version.

This is basically the same goal I set last year. I'm not quite at the pace I hoped to be at for completing the goal, but I'm also not terribly far off. The biggest problem has been the size of the project growing out of control. Usually my pattern is to alternate between major changes and new features in one build, and major cleanup and bug fixes in the next, but for the last few builds the cleanup and bug fixes haven't been enough. Problems are coming from all sorts of directions:

  • File change notifications are still problematic on every OS
  • The OpenGL renderer exhibits blinking bugs on Windows
  • Apple deprecated OpenGL support on Mac
  • There are different keyboard bugs on every OS
  • Good DPI scaling is becoming a necessity, but works a little differently on each OS
  • The core has a number of bugs relating to configuration that have been hard to nail down
  • There are a few crash bugs in the core that have eluded my testing for the last few builds
  • The customization layer has gotten complex enough that many bug reports just come down to a problem caused by the API complexity rather than a bug in the core or an obvious mistake by the author of the customization code
  • The build and packaging system for the 12 different files I distribute with each build has gotten very hard to maintain (most egregiously the 32-bit Linux versions stopped working silently in 4.0.29 and I couldn't recover them for 4.0.30 thanks to Linux package/dependency/linkage problems)


Stabilizing and fixing everything has to be task number one this year, but just getting a handle on everything has proven to be very difficult. 2018 was the first year I tried to simultaneously support three operating systems for every build all year, and I think that is a part of what has slowed me down so drastically. This gets us to the big change I want to introduce to my workflow for 2019:

New Approach to Builds
Starting with the next build I will not be updating all versions for all OSes with each build. The next few builds will all be bug fixes for Windows. Then after that I will do a round of bug fixes on Mac, and then on Linux.

After an initial phase of focusing on one OS at a time to stabilize everything, I will evaluate whether I want to keep working one OS at a time or not. I suspect that if I limited most builds to updating the files for one OS, I could move a lot more quickly and confidently than I could in 2018. I recognize that this might leave valuable users waiting for a while to get the newest features on the OS they are using, so I want to hear back from anyone who is concerned about this. I think this method will benefit everyone in the long run with more stable builds and faster development and iteration on the core, but if you disagree for any reason, or have specific concerns about how this might affect you that could be mitigated, I want to hear about them. I'm still just considering this.

Final New Features
Once I'm past the stabilization problems, I still have a few big changes I want to get into this major version. My to-do list for this year is mostly just inherited from last year:

  1. Undo-Redo Upgrade
    The Undo-Redo system is still old, messy, and all locked up in the core. There are three big changes to be made:
    • Expose Undo-Redo through the customization API.
    • Extend Undo-Redo information to support per-buffer and global Undo-Redo.
    • Extend the Undo-Redo with an easy-to-use grouping mechanism for putting multiple edits under a single group to be undone or redone together.

  2. Event Handling Upgrade
    Last year I did some work on how events are processed in the core and triggered in the custom layer. In that process I formed a vision of a more unified communication channel from the core to the custom layer. Currently the core can trigger a command from a command map, call a hook, and on init call the "get_binding_data" procedure, all of which fit together very differently. I think that both sides, the core and the custom layer, will be better off with a more unified vocabulary of events.
  3. Keyboard Input Handling
    The keyboard input handling right now is poor for non-English users, and is especially bad for any language that relies on IME (or similar) input. In order to really represent input properly, the interface used for specifying bindings has to change. I'm putting this off to the end of this major version because it will break all custom bindings. Hopefully, anyone who doesn't need the improved input will be able to skip this build.


Conclusion
With that I've explained my current thoughts on the 2019 road map. Questions and comments are encouraged!

Thanks everyone!
Allen Webster
Intro

It looks like the next build of 4coder (4.0.29) is going to be ready sometime in the next few weeks. The new build has been in development for a couple of months and is loaded to the brim with new features that have all gone through interesting architectural and algorithmic design, which I believe is worth sharing for several reasons. First, it will prepare 4coder users who want to start writing customizations to think about the new features. Second, it will give anyone who likes thinking about the process of architecture and algorithm design some examples of my own process. Third, it might expose me to criticisms or suggestions that could help me improve the specifics of the new 4coder features before I put them into the wild.

Directory of all parts:
  1. Memory Management Overview
  2. Memory Management Variables and Objects
  3. Memory Management Scopes
  4. Custom UIs and Various Layers for Lister Wrappers
  5. Custom Cursors, Markers, and Highlights, and the Render Caller


Scopes Basics

If you're not caught up with part 1 and part 2, go back and get caught up, because this part will make no sense without that context. Scopes are all about two things: managing lifetimes and keying values you want to store. Both of these terms deserve more elaboration. By "managing lifetimes" I mean a few things. There is the fact that the memory put on the scope is freed when the scope closes, thus preventing memory leaks, but there's a bit more to it too. Since scopes contain variables, we can close and reopen a scope to free its contents and set all its variables back to default. In other words, managing lifetimes is about doing bulk actions that clear a state back to default.

By "keying values" I am talking about the method you use to store and retrieve a value. One big difference between an object and a variable in this system is that the two are keyed very differently. Variables are keyed by a scope and a compile time constant, while objects are keyed by runtime generated handles. These differences have all sorts of implications. If you have an function that does several variable operations inside a scope, that function only needs to be parameterized on the scope, whereas the same operation on several objects would have to be parameterized on each object handle. Of course we can construct all sorts of other scenarios where we would prefer not to have to have a scope and variable name and would rather just manipulate everything through a few runtime handles. The point of this is whenever we put something into a scope it's not just about tieing the lifetime of the allocation to the scope so that it will free at the appropriate time, but we're also setting ourselves up to be able to find that information when, and only when, we want to be able to get it back.

The core provides scopes tied to the lifetime of buffers, and scopes tied to the lifetime of views:
Managed_Scope buffer_get_managed_scope(Application_Links *app, Buffer_ID buffer_id);
Managed_Scope view_get_managed_scope(Application_Links *app, View_ID view_id);


User Scopes

Tying scopes to buffers and views does a lot of work, but once I started getting used to scopes I realized that module authors could have all sorts of reasons to create scopes themselves, so that their own concepts with lifetimes could also have all the advantages of the scoping system. For instance, suppose you build a little debugger integration into 4coder and you have a command that explicitly begins a debug session and another that explicitly ends the session. Those would be perfect times to start and end a managed scope, so that all the data you create and store along the way that is specifically relevant to the debugging session, but not relevant without it, can live on that scope. For this purpose I am introducing a pair of calls for creating and destroying user managed scopes.

Managed_Scope create_user_managed_scope(Application_Links *app);
bool32 destroy_user_managed_scope(Application_Links *app, Managed_Scope scope);


Note that introducing this destroy call has the possibility to confuse users who think they can or should destroy scopes that come from the core, like the view and buffer scopes. However, I didn't want to introduce a different type for user scopes because, besides this one quirk, everything else a scope can do is universal between core scopes and user scopes: variable set, variable get, object allocate, and all of the scope operations to come in this post.
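Continuing the debugger example from above, a sketch of how the pair of calls might frame a session; the command bodies and the global here are mine, made up for illustration:

// Sketch: tie all debug session data to one user managed scope.
static Managed_Scope debug_session_scope = 0;

CUSTOM_COMMAND_SIG(begin_debug_session){
    debug_session_scope = create_user_managed_scope(app);
    // ...store session variables and objects on debug_session_scope...
}
CUSTOM_COMMAND_SIG(end_debug_session){
    // one call frees everything the session stored along the way
    destroy_user_managed_scope(app, debug_session_scope);
    debug_session_scope = 0;
}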

Bulk Clear - Heaps in Heaps

I've used the phrase "bulk" several times now, and this is the first part of the system that is algorithmically interesting. When I say bulk, I am not just saying that a whole bunch of individual frees are done automatically. That would be "bulk" in the sense that one call triggers a bunch of work for convenience, but I actually mean "bulk" in the sense that one call achieves a bunch of work by a more efficient method than you could achieve without going bulk. First, let's consider a scope as basically having two main allocation systems: the allocation for variable storage and the allocation for object storage. Variables are stored in a sparse lookup table that is allocated in a single contiguous block and can be freed in one operation, no matter how many variables in the scope were actually modified from their default value.

Bulk freeing allocation for objects is the more interesting case. Recall that memory objects are allocated with a variable size and support a free operation:
Managed_Object alloc_managed_memory_in_scope(Application_Links *app, Managed_Scope scope, int32_t item_size, int32_t count);
bool32 managed_object_free(Application_Links *app, Managed_Object object);


In 4coder this is all based on a very simple heap-style allocator, but the implementation type and optimization details are actually not important to the concept; an allocator that supports the following operations is all we need to alloc, free, and bulk free our objects:
struct Allocator{ /*off topic*/ };
Allocator make_empty_allocator();
void allocator_extend_available_memory(Allocator *allocator, void *mem, int32_t size);
void *alloc(Allocator *allocator, int32_t size);
void free(Allocator *allocator, void *ptr);


The keys here are that we have an Allocator handle in each operation, so that we can build multiple allocators and easily parameterize allocations on the allocator, and that we can extend an allocator by an arbitrary amount while knowing that it does not automatically extend itself. Then what we can do is have a source of memory at the top of the system, which can be an Allocator or anything else, and build a separate allocator for each scope. Whenever we allocate an object on the scope, we go through the allocator assigned to that scope, so all of the allocations belong to a set of larger contiguous ranges; if we fail to get the memory we need, we go to the larger allocation system at the top and get more memory to extend the scope's allocator. Finally, when the time comes to free all objects, we can iterate each page that we got from the top and free it back to the top; we don't have to iterate all the little objects and free each one individually. If we do a good job of putting lots of objects on each page, we have greatly reduced the amount of freeing work we have to do on a bulk free operation.

With this design I have found that it almost never takes more than two frees to the top to destroy any scope (often less than two because scopes don't allocate anything for variables or objects before they have to).
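In pseudocode shaped like the allocator interface above, the per-scope bookkeeping might look like this; the fixed-size page list and the two top-of-system calls are my own sketch of the idea, not the real implementation:

struct Scope_Allocator{
    Allocator allocator;   // heap built over the pages below
    void *pages[16];       // blocks pulled from the top of the system
    int32_t page_count;
};

void *scope_alloc(Scope_Allocator *scope, int32_t size){
    void *ptr = alloc(&scope->allocator, size);
    if (ptr == 0){
        // extend with a fresh page big enough for many objects
        int32_t page_size = 4096;
        void *page = top_of_system_get_memory(page_size);
        scope->pages[scope->page_count] = page;
        scope->page_count += 1;
        allocator_extend_available_memory(&scope->allocator, page, page_size);
        ptr = alloc(&scope->allocator, size);
    }
    return(ptr);
}

void scope_bulk_free(Scope_Allocator *scope){
    // free whole pages back to the top; never touch individual objects
    for (int32_t i = 0; i < scope->page_count; i += 1){
        top_of_system_free_memory(scope->pages[i]);
    }
    scope->page_count = 0;
    scope->allocator = make_empty_allocator();
}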

Bulk clears and resets are so useful that sometimes we will want to do them without destroying the scope, and sometimes we will want to do a bulk clear on a core scope that we can't destroy. The API supports this:
bool32 clear_managed_scope(Application_Links *app, Managed_Scope scope);


Scopes with Multiple Dependencies (Turning the wacky and wild dial to eleven)

It turns out that scopes as described so far, while they solve a lot of useful problems, still feel like they fall just short of fully solving the problem in a lot of cases. Let's look at the case of building the sticky jumps system in 4coder. Sticky jumps is the name for the system that powers jumping to errors and jumping to items in a search list. They are "sticky" because once the jumps are parsed, the positions are marked with markers which will follow the text around as the buffer is edited. This means that jump lines and positions remain correct even as the text around the position changes. We will look at how markers operate in more detail in part 5, but for now the important point is that markers are only relevant while the buffer they mark still exists and while the buffer that contains the jump lines still exists, because in 4coder the way you trigger a jump needs the jump buffer even after the markers for sticky jumps have been placed. So what do we do?

If we place the markers in the scope of the destination buffer, they will be freed automatically when that buffer is closed, but if the jump buffer is closed it has to manually free all the markers it created; and by the way, jump buffers are closed and reopened all the time: for instance, the compilation buffer, which can have jumps, is closed and reopened every time you rebuild. If we place the markers in the scope of the jump list buffer, we just flip the problem around without solving it: now the jump lists, which are frequently closed, get the bulk automatic free, but all other buffers in the entire system have to take themselves out of all the lists out there whenever they close. We can't just leave markers in the scope of the jump list and hope the jump list gets closed and reopened soon, because as long as that info sits there, someone might try to jump to it even though the system should know that's impossible now. It's not just a leak to leave something around; it's a bug!

// Bad scenario one: store markers in the scope of the destination buffer
BUFFER_CLOSE_HOOK(sticky_jump_cleanup_hook){
    Jump_List *list = lookup_jump_list(buffer_id);
    if (list != 0){
        for (int i = 0; i < list->count; i += 1){
            managed_object_free(app, list->marker_handle[i]);
        }
        free_jump_list(list);
    }
    // By the way, when we close buffers that have
    // markers attached, do we just leave the stale
    // handle in the jump list, or... ???
}
// Bad scenario two: store markers in the scope of the jump buffer
BUFFER_CLOSE_HOOK(sticky_jump_cleanup_hook){
    List_Of_Lists_That_Refer_To_A_Buffer *list_of_lists = lookup_list_of_lists(buffer_id);
    if (list_of_lists != 0){
        for (int i = 0; i < list_of_lists->count; i += 1){
            Jump_List *list = lookup_jump_list(list_of_lists->lists[i].list_buffer_id);
            if (list != 0){
                int index = list_of_lists->lists[i].my_sub_index;
                managed_object_free(app, list->marker_handle[index]);
                list->marker_handle[index] = 0;
            }
        }
        free_list_of_lists(list_of_lists);
    }
}


I hope you can feel where this is going. We want a scope where we can place information, and we want that scope to be dependent on two different buffers. When I started to feel this problem I decided to take some time to do some architectural investigation and figure out what it was I really wanted. In this case, all I knew about the API was that I wanted a new way to get a Managed_Scope; I couldn't confidently say more than that because I wasn't sure what was feasible, so I started thinking about the rest of the problem from the implementation end. Should I support some kind of "pairs of scopes" system? Or should I go full arbitrary sets of dependencies? Should I create a user scope and set the dependency, or should the core own the scopes with multiple dependencies? I started by asking if the best case scenario was feasible: arbitrary sets of dependencies, automatically managed by the core.

You might question whether "automatically managed by the core" is the best case scenario, because it appears to remove some flexibility from the system. Maybe a user wants to create and close the scope manually and also set it to automatically close in other conditions! But I thought "core managed" was better because I wanted to rely on the scope's capacity for keying. For example, if I have a pair of buffers that I used to get a scope with multiple dependencies, and then I store a variable into that scope, I want it to be trivial to get that variable again the next time I happen to have the same pair of buffers. If I used user-created scopes, then although the scope would have the right lifetime properties, the pair of buffers would not be a key that retrieves that scope, which is a feature I knew I would need.

CREDITS: Andrew Chronister, the Chronal Dragon, deserves the credit for coming up with the heart of this system. I had already conceived of the problem and endeavored to evaluate the best possible solution, and before I discussed it with Andrew, my best solution wasn't very good. He provided the key insight that I needed to support this feature. In the process we used to reach the solution, we rephrased the problem into simpler components that capture the actual heart of the problem. In this case we have "objects", which are things like buffers and views, which we will name with capital letters A, B, C; and "keys", where a key is a set of objects. All possible sets of objects are valid keys.

The connection back to scopes is that each possible key will have a scope associated with it, and the objects are the things on which scopes are dependent. When you get the scope for a view or buffer, and when you create a user scope, you are getting a "basic scope", which is a scope associated with a "basic key": a key with one dependency.

Examples of basic keys:

1 = {A}
2 = {B}
3 = {C}
4 = {D}


A scope with multiple dependencies is a scope associated with a key with multiple elements. In other words keys like the following:

5 = {A, B}
6 = {A, C}
7 = {A, D}
8 = {A, B, C}


When we want a scope dependent on two different buffers we will get the scopes for each individual buffer, then pass the scopes into a call that builds the union of their keys and uses the new key to lookup and return the scope with multiple dependencies.

The tricky part of this is making sure we have enough information to do all the necessary cleanup work when we destroy an object. Obviously keys are constructed as a set of objects, so one option is that when we destroy object A, we iterate all keys and destroy any key that contains A, removing it from whatever accelerated lookup structures we use. But I didn't like this because I was expecting there to be a lot of keys. After all, there is automatically one for each buffer and view, and then for any union of basic keys that has been used at least once, those keys exist too. The key insight that I needed to put this whole thing together was to follow the steps for creating keys and destroying objects, described below in pseudocode:

struct Key{
    struct Object **objects;
    int object_count;
};
struct Object{
    // NOTICE: the object stores the list
    // of keys that currently contain it
    struct Key **keys;
    int key_count;
    int key_max;
};

Object* create_object(){
    return(AllocAndClearToZero(Object));
}

Key* get_key(Object **objects, int object_count){
    SortPointers(objects, object_count);
    object_count = RemoveNeighborDuplicates(objects, object_count);
    Key *key = LookupKeyInLookupStructures(objects, object_count);
    if (key == 0){
        key = AllocAndClearToZero(Key);
        key->objects =
            AllocArrayAndCopy(Object*, object_count, objects);
        key->object_count = object_count;
        // NOTICE: when we create a key we iterate the
        // member objects and insert ourself into the
        // object's key list
        for (int i = 0; i < object_count; i += 1){
            StretchBufPush(objects[i]->keys,
                           objects[i]->key_count,
                           objects[i]->key_max, key);
        }
        InitObjectsAttachedToKey(key);
        InsertKeyIntoLookupStructures(key);
    }
    return(key);
}

void internal__destroy_key(Key *key, Object *cause_object){
    for (int i = 0; i < key->object_count; i += 1){
        // no point in doing anything to the cause object
        // it is about to free itself
        Object *object = key->objects[i];
        if (cause_object == object) continue;
        // we remove key from this object
        // TODO: accelerate to avoid linear search?
        for (int j = 0; j < object->key_count; j += 1){
            if (object->keys[j] == key){
                object->key_count -= 1;
                Swap(Key*, object->keys[j],
                     object->keys[object->key_count]);
                break;
            }
        }
    }
    CleanupAnythingAttachedToKey(key);
    RemoveKeyFromLookupStructures(key);
    Free(key->objects);
    Free(key);
}

void destroy_object(Object *object){
    // NOTICE: we have the set of all keys that
    // contain object and we iterate it to destroy
    // all dependent keys
    for (int i = 0; i < object->key_count; i += 1){
        internal__destroy_key(object->keys[i], object);
    }
    Free(object->keys);
    Free(object);
}


The parts not in CamelCase constitute the heart of the system. We have objects that we create and destroy, and keys that we create under the hood whenever we first try to get them, and that are then destroyed when any of their objects are destroyed. Creating a key puts it into the key lists of all the appropriate objects. Destroying an object destroys all of its keys, and destroying a key causes it to remove itself from all of its objects. I usually don't like letting something get too complicated, and this definitely qualifies as complicated, but now we have the task of comparing it to the alternative.

The alternative is that every time someone wants a scope with multiple dependencies, they start building lookup tables so that when an object gets destroyed they can find all of the data structures they need to clean up. If you scroll back up to bad scenario one you will see the comment:
// By the way, when we close buffers that have
// markers attached, do we just leave the stale
// handle in the jump list, or... ???


And if you look at bad scenario two you will see the List_Of_Lists_That_Refer_To_A_Buffer. I get the distinct impression from all of this that this pattern of self-referential business, all this "keeping a list of places where there are pointers to me so I can remove them" stuff, is just the general nature of this problem. Whether or not this is the simplest or best solution, I don't know. What I do know for sure is that going case by case wasn't actually making it easier; the extra context of a specific case wasn't enough to simplify the problem. Maybe someday we'll find a simpler solution for this class of problem, but for now, I think we're all very likely to be rewriting the same pointer webs everywhere. I thought I might as well move all the pointer webbing into the core, do it once, and get it right so that we never have to do it again for this class of problem.

All of that thinking and considering boils down into one little call:
Managed_Scope get_managed_scope_with_multiple_dependencies(Application_Links *app, Managed_Scope *scopes, int32_t count);


This call really does a union operation on the scopes you pass in. If you pass the same scope twice, you'll get that scope back:

1 = {A}
1 union 1 = {A} union {A} = {A} = 1


If you pass a scope that was already the result of a union, it will be unioned again:

5 = {A B}
6 = {A C}
8 = {A B C}
5 union 6 = {A B} union {A C} = {A B C} = 8


Looking at the scopes in this way, it occurred to me that I was missing the scope for the empty set. Such a scope would represent a "global" scope. It has no member objects, so it requires no key to acquire it; when you're thinking about keying, "requires no key to acquire" means global. This scope also has the unique property that nothing would ever be able to destroy it: since it is non-basic it has to be destroyed by destroying an object, but no object can ever destroy it. So, quite appropriately, the global scope is a scope that lasts forever (well, until you close 4coder, as if that ever happens :D). You could acquire the global scope with get_managed_scope_with_multiple_dependencies by passing in no scopes, but to make life easier the API provides the call:
Managed_Scope get_global_managed_scope(Application_Links *app);


This scope, being associated with the empty key, has the unique property that it has no effect when its key is unioned with other keys:

0 = {}
4 = {D}
0 union 4 = {} union {D} = {D} = 4


This might seem confusing, but it is actually quite appropriate. The global scope always exists, so it should be meaningless to say that something depends on the lifetime of the global scope.

Finally, just as it is useful to do bulk clears of single scopes, it also strikes me as occasionally useful to treat a basic scope as an identifier for the single object it contains, and clear all the scopes that depend on that same object:
bool32 clear_managed_scope_and_all_dependent_scopes(Application_Links *app, Managed_Scope scope);


A Scope with Multiple Dependencies Example (Adjusting to the new wacky world we just created)

Okay, that was a lot; let's see how this thing looks in action. Let's return our thoughts to the sticky jumps problem and solve it with this new power. We can now put the markers into a scope that is dependent on both buffers. We still need a single list of all the object handles spread out across all those scopes, so that we can find what we're looking for from just the jump buffer. To this end we will create an array that stores information per jump line. The information will include the id of the destination buffer, a subindex that tells us that this is the Nth jump from this list that goes into this buffer, and the line number of the jump line in the jump buffer.

Triggering a jump looks something like this:
void jump_to_line_at_cursor_nearest(Application_Links *app){
    View_Summary view = get_active_view(app, AccessAll);
    Managed_Scope jump_scope = buffer_get_managed_scope(app, view.buffer_id);
    Managed_Variable_ID jump_array_varid = managed_variable_create_or_get_id(app, "STICKY_JUMPS.jump_array", 0);
    uint64_t value = 0;
    if (!managed_variable_get(app, jump_scope, jump_array_varid, &value)){
        return;
    }
    if (value == 0){
        return;
    }
    Managed_Object jump_array = (Managed_Object)value;
    int count = managed_object_get_item_count(app, jump_array);
    Jump_Data *jumps = AllocArrayAndClearToZero(Jump_Data, count);
    managed_object_load_data(app, jump_array, 0, count, jumps);
    int best_index = find_best_jump(jumps, count, view.cursor.line);
    Jump_Data jump = jumps[best_index];
    Managed_Scope dest_scope = buffer_get_managed_scope(app, jump.dest_buffer_id);
    if (dest_scope == 0){
        Free(jumps);
        return;
    }
    Managed_Scope scope_array[2];
    scope_array[0] = jump_scope;
    scope_array[1] = dest_scope;
    Managed_Scope marker_scope = get_managed_scope_with_multiple_dependencies(app, scope_array, 2);
    Managed_Variable_ID marker_array_varid = managed_variable_create_or_get_id(app, "STICKY_JUMPS.marker_array", 0);
    if (!managed_variable_get(app, marker_scope, marker_array_varid, &value)){
        Free(jumps);
        return;
    }
    if (value == 0){
        Free(jumps);
        return;
    }
    Managed_Object marker_array = (Managed_Object)value;
    Marker marker = {0};
    // load the marker at this jump's subindex within the marker array
    if (!managed_object_load_data(app, marker_array, jump.sub_index, 1, &marker)){
        Free(jumps);
        return;
    }
    DoJump(app, jump.dest_buffer_id, marker.pos);
    Free(jumps);
}


If that looks like more work than we would have done without managed variables and objects and scopes everywhere, you're absolutely right. This code would be half as long if all the memory was allocated and managed directly by the custom side. However, by doing the extra work to pass all memory management through this system, we are now completely free from having to write any freeing code for markers and lists, and while this is a lot of code, I find it much easier to get right than the "list of lists with pointers to me" type of code.

We still leave stale handles behind, but instead of those stale handles being pointers that could refer to memory that has been given a new purpose or taken out of the valid address space, stale handles are now gracefully handled by the API, which checks and indicates that a handle has gone stale, allowing us to prevent a crash and possibly even take appropriate action in the absence of a viable jump destination.

Thanks for reading through this whirlwind of a post! Next up is something much lighter: the entire UI system and the series of layers of wrappers for listers. See you then!
Allen Webster
Intro

It looks like the next build of 4coder (4.0.29) is going to be ready sometime in the next few weeks. The new build has been in development for a couple of months and is loaded to the brim with new features that have all gone through interesting architectural and algorithmic design, which I believe is worth sharing for several reasons. First, it will prepare 4coder users who want to start writing customizations to think about the new features. Second, it will give anyone who likes thinking about the process of architecture and algorithm design some examples of my own process. Third, it might expose me to criticisms or suggestions that could help me improve the specifics of the new 4coder features before I put them into the wild.

Directory of all parts:
  1. Memory Management Overview
  2. Memory Management Variables and Objects
  3. Memory Management Scopes
  4. Custom UIs and Various Layers for Lister Wrappers
  5. Custom Cursors, Markers, and Highlights, and the Render Caller


"The Variable"

First let's take a look at the variable feature. We require that it be as easy as possible for users to get the handles to variables, without ever having to pass them around if that would be too much hassle. Another way to put it is that we want our handles to be compile-time constants. However, if we start using compile-time constants, we also want to think about how to avoid collisions between modules that don't know about each other, so just using an enum-style compile-time constant doesn't work. If my module thinks variable 1 stores the previous index of the "paste next" command, and your module thinks it stores the mode for a vim emulation, we're going to have a really big problem.

Instead, variables are named by strings. Alice's module can set "ALICE_IN_CLIPBOARD_LAND.index" and Bob's module can set "BOB_VIM_MASOCHIST.mode". There's no rule at all about what the string can be, except that it should be non-zero in length and have no nulls before the null terminator. I would like to encourage users to use distinct module names followed by a dot for every variable name, but I don't intend to force anything.

There are other ways I could go about structuring this. For instance, I could have everyone name their module with a string, and then within each module use an index to get to each variable. However, I'm not convinced this would actually be better. Customizers still have to use a string to query the module anywhere they interact with the variable, and if Alice and Bob pick the same module name we still have to do a replace-all on one of the two modules to integrate them. If anyone has another idea, let me know!

// Module Names vs Variable Names
// I don't see any real advantage to one or the other as an API
// but it takes more to implement the module version.
void do_foo_module_version(Application_Links *app){
    uint64_t value = 0;
    read_variable(app, "FOO_MODULE", FooVariable_Bar, &value);
}
void do_foo_variable_version(Application_Links *app){
    uint64_t value = 0;
    read_variable(app, "FOO_MODULE.bar", &value);
}


What should we be worried about with this system? In a statically compiled language we rarely have to worry about variable misspelling mistakes; this variable feature threatens that property. There are a few mitigations: one we can do in the API design, and another users can do in their usage code. Another issue we want to think about is reducing the frequency with which we have to hash a variable name when the usage side could be storing the result of a hash and reusing it. In the API design we can help with both of these problems by requiring a "create variable" call before a variable can be used (as opposed to creating it on the fly the first time we read or write it) and by having the create call return some kind of fixed-width handle that doesn't need to redo the hash.

These concerns lead to the API:

Managed_Variable_ID managed_variable_create(Application_Links *app, char *null_terminated_name, uint64_t default_value);
Managed_Variable_ID managed_variable_get_id(Application_Links *app, char *null_terminated_name);
Managed_Variable_ID managed_variable_create_or_get_id(Application_Links *app, char *null_terminated_name, uint64_t default_value);
bool32 managed_variable_set(Application_Links *app, Managed_Scope scope, Managed_Variable_ID location, uint64_t value);
bool32 managed_variable_get(Application_Links *app, Managed_Scope scope, Managed_Variable_ID location, uint64_t *value_out);


Now users have various options for creating and getting variable ids from their compile-time constant names. Technically the names don't even have to be compile-time constants, but the API is not trying to help you operate that way, which is why it takes null-terminated strings without length specifiers. Anything you would like to do with names generated at runtime should turn out better when implemented through objects, which are meant for more intricate data types. If you're not worried about hashing work, and usually you won't have to be, you can just use the "create or get" option every time you operate on a variable. If you do care about the hashing, you can store the variable ids in global integers. This will work just fine for now, but when 4coder supports reloading and swapping out the custom layer, global variables will have to be managed with extra care.
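A small sketch of that id-caching pattern under the API above; the lazy-init helper is mine, and I'm assuming zero is never a valid id:

// Sketch: hash "FOO_MODULE.bar" once, reuse the id afterwards.
static Managed_Variable_ID foo_bar_varid = 0;

Managed_Variable_ID get_foo_bar_varid(Application_Links *app){
    if (foo_bar_varid == 0){
        foo_bar_varid = managed_variable_create_or_get_id(app, "FOO_MODULE.bar", 0);
    }
    return(foo_bar_varid);
}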

Users can reclaim their compile-time spellchecking by putting variable names into global constants.

Example of using the variable:
static const char *aabbb_parent = "ALICE_AND_BOBS_BUFFER_BOOPER.parent";
CUSTOM_COMMAND_SIG(switch_to_parent_buffer)
CUSTOM_DOC("If this buffer has set it's parent buffer, switch to the parent buffer")
{
    View_Summary view = get_active_view(app, AccessProtected);
    Managed_Scope buffer_scope = buffer_get_managed_scope(app, view.buffer_id);
    Managed_Variable_ID varid_parent = managed_variable_create_or_get_id(app, aabbb_parent, 0);
    uint64_t parent_buffer_id = 0;
    if (managed_variable_get(app, buffer_scope, varid_parent, &parent_buffer_id)){
        if (parent_buffer_id != 0){
            view_set_buffer(app, &view, parent_buffer_id, 0);
        }
    }
}


Objects (sans Orientation)

If you're anything like me, the word "object" immediately puts you in a state of unease. The word is usually reserved for object-oriented programming paradigms that I long ago learned to avoid. In fact, the objects in this API are not object oriented at all; it's just that no other generic non-descriptive noun fits very well either. "Entity" is even more reserved for game engines than "object" is for OOP. These objects work as memory allocations, but they can often work as more than just a memory allocation, so "Memory" feels inappropriate, as does "Array". "Buffer" would be very appropriate, but in a text editor that word binds more closely to the text storage system and would be very confusing. Although I find it unfortunate, "Object" is the best I have been able to do, so instead of inferring meaning from the name, try to focus on how I describe it.

Objects were first introduced to store arrays tied to buffers and views. Arrays could have been simulated with variables by appending an index to a string name and doing everything in variables, but I would like to stay out of the business of generating variable names at run time. If we think of an object as a generic array system, most of the API just materializes from the simplest possible setup. We will need an allocation call, a free call, a call to store data into the array, and a call to load data out of the array. We have the choice of either always doing byte arrays and letting the usage code manually multiply by sizeof(T), or having the user specify the allocation by item size and count.

// Always by Byte vs Indexed by Item Size
// Do we want more control at the site of store/load?
void foo_by_byte(Application_Links *app, Managed_Scope scope){
    Managed_Object object = alloc_managed_memory_object_in_scope(app, scope, sizeof(Foo)*100);
    Foo foo[100];
    make_up_these_foos(app, foo, 100);
    // I have more freedom here to do any byte indexes I want,
    // I am not a slave to what the alloc call thought this
    // memory was for.
    managed_object_store_data(app, object, 0, sizeof(Foo)*100, foo);
    Assert(sizeof(Foo)*100 == managed_object_get_size(app, object));
}
// Do we want more control at the site of alloc?
void foo_by_item(Application_Links *app, Managed_Scope scope){
    Managed_Object object = alloc_managed_memory_object_in_scope(app, scope, sizeof(Foo), 100);
    Foo foo[100];
    make_up_these_foos(app, foo, 100);
    // My freedom here is gone :( I have to follow the
    // addressing rules set by the alloc call.
    // At least I have less to type!
    managed_object_store_data(app, object, 0, 100, foo);
    Assert(sizeof(Foo) == managed_object_get_item_size(app, object));
    Assert(100 == managed_object_get_item_count(app, object));
}


I eventually decided to use separate item size and count, but the reasons for that will not be clear until part 5 of this series.

As I said, sometimes objects act as more than just "arrays". The codebase currently supports two types of objects, and there is a third type, very clearly defined in my head, that I will probably be adding either in this build or in a subsequent build not too far off. All object types share the traits that they have an item size and an item count, can store and load data, and can be freed. The presence of different types basically just means expanding the API to one allocation call per type and adding a get-type query.

Putting it together, the API for object features:
Managed_Object alloc_managed_memory_in_scope(Application_Links *app, Managed_Scope scope, int32_t item_size, int32_t count);

int32_t managed_object_get_item_size(Application_Links *app, Managed_Object object);
int32_t managed_object_get_item_count(Application_Links *app, Managed_Object object);
Managed_Object_Type managed_object_get_type(Application_Links *app, Managed_Object object);
Managed_Scope managed_object_get_containing_scope(Application_Links *app, Managed_Object object);
bool32 managed_object_free(Application_Links *app, Managed_Object object);
bool32 managed_object_store_data(Application_Links *app, Managed_Object object, uint32_t first_index, uint32_t count, void *data);
bool32 managed_object_load_data(Application_Links *app, Managed_Object object, uint32_t first_index, uint32_t count, void *mem);


This list of calls includes the allocator for managed memory but the other specialized type allocation call is not included because we have a lot to learn and discuss about the new systems before it will make sense.

A Managed_Object handle fits in 64 bits and converts to unsigned integers, so handles can be stored in variables. One thought I have had is using "named objects": essentially using the variable method of putting a handle onto an object, but then using the object method of allocation. The initial reason I resist that thought is that then every call involved with creating variables and every call involved with querying and modifying objects has to be duplicated into an incompatible set of signatures. If Alice has written some generic code that can do some work given an object as a parameter, and Bob has stored his object as a "named object", we would ideally like YOU to be able to use both of these systems and pass Bob's object to Alice.

How is that going to work? If Alice has implemented her thing to work with Managed_Object handles, and Bob has used names, she will either have to duplicate her code, or she will have to make some kind of generic wrapper for both handle types. We obviously don't want her to duplicate code, and if she has to write her own wrapper, then so will Bob and every other module, and they'll all be incompatible. We could provide a standard wrapper, but then it sounds to me like we added two entry points to a system, didn't actually want two entry points, and so wrapped it back up into one entry point. If we really want to write the code:

void foo_named_object(Application_Links *app, Managed_Scope scope){
    Managed_Variable_ID id = managed_variable_create_or_get_id(app, "FOO_MODULE.bar_object", 0);
    Foo foo[100];
    make_up_foos(app, foo, 100);
    named_managed_object_store_data(app, scope, id, 0, 100, foo);
}


We can just make named_managed_object_store_data as a wrapper, and then Bob's code can work that way, and when we want to pass to Alice we just go below the wrapper and get the Managed_Object. I am not convinced we will ever want this badly enough that it's even worth building the wrappers, but if it does become an issue, this is the plan.

In the next part, I will go into scopes, which brings this all together and is way more intricate than you might think!