A Insert-only table is a database table strategy that keeps all the past versions of an object in a simple and lightweight way . It allows the system to efficiently represent the state as it used to be at a past point in time and show changes over time. No self-referential are needed, avoiding costly query strategies. Single rows keep the full-state at a point in time, avoiding the need to query many rows to recover a single object.

Properties of this model

An insert-only table requires the following columns:

And all the other columns you want to add of course.

The Trigger magic

There needs to be a trigger on insert that selects a row with the same "oid" and ts_end of null and updates it with the value of the "ts" of the row being inserted.

CREATE TRIGGER IF NOT EXISTS rules_rule_insert_trigger BEFORE INSERT on rules_rule 
    FOR EACH ROW 
    BEGIN 
        UPDATE rules_rule SET ts_end = new.ts WHERE oid = new.oid AND ts_end IS NULL; 
    END;

Creating an object

INSERT the object with:

Updating of an existing object

Instead of updating, Insert an object with:

This will start the trigger which grabs the ts value of the row your inserting and stores it in the ts_end of the row with the same oid and ts_end of null

Deleting an object

Deleting an object is also a insert, exactly like updating except that:

Undeleting an object

Since we don't remove anything, undelete is possible by following the update process on a "deleted" oid. Magic.

Reading objects

Generally you care about what is currently active. This is done by selecting rows that have ts_end of null and op <> 'D'.

Getting all the history of an object is of course as simple as selecting on the oid and optionally ordering on ts.

References to objects

Depending on the usage, it's sometimes better to refer to an exact version of the object (vid) or the object independent of it's exact version (oid). This system allows both. If this is hard to choose, it means you need to design the temporal dimension of your data model better.

Speed it up with Indexes

To speed up updates and deletes, add the following index:

create unique index rules_rule_oid_ts_end_uniq on rules_rule (oid, ts_end);

This also guarantees that you can never have two versions of the object that are current at the same time.

If you have a usage pattern with a lot of objects that is heavy on querying for the latest state of the objects, you can add an index of the oid with a where clause on a ts_end of null. Add the additional condition on op <> 'D' if you also have a lot of deleted objects slowing it down. Like so:

create unique index rules_rule_active_idx 
    on rules_rule (oid)
    where ts_end IS NULL and op <> 'D';

Hiding the truth

You can imitate a traditional table (read-only) by creating a view where only rows with ts_end is null and op <> 'D' . In the fields only show oid and ts as the last update time and nobody will ever be any wiser.

CREATE VIEW
rules_rule_active (oid, ts, …)
AS select oid, ts, …
    FROM rules_rule 
    where ts_end IS NULL 
        and op <> 'D';

Purging the old

If the table becomes unwieldy, you can remove all oids of objects that have op = 'D' and ts is older then a certain age.Another (complementary) approach is to remove entries with ts_end of a certain age. The only artefact will be that the oldest entry will be a op of "U" rather then "A".

Django implementation

As an example, here's the Model in Django. If there are multiple models that use this, this could be abstracted to a sub-classable Model-subclass . models.py

class Rule(models.Model):
    oid = models.UUIDField(null=False, editable=False, db_index=True, help_text="Object ID of the rule. Stable with
    different updates of the rule.")
    vid = models.UUIDField(null=False, editable=False, primary_key=True, help_text="Unique ID of the version of the rule.
    Changes with each update.")
    ts = models.DateTimeField(blank=False, auto_now=True, editable=False, db_index=True, help_text="The moment this version
    was created")
    ts_end = models.DateTimeField(blank=True, null=True, editable=False, help_text="The moment the rule no longer is valid")
    op = models.CharField(max_length=1, null=False, default="A", editable=False, help_text="Operation of this version",
    choices=[("C", "Create"), ("U", "Update"), ("D", "Delete")])

    ## The fields of your domain.
    objects = models.Manager()  # All of the rule versions.
    active = ActiveRuleManager()  # Only the currently active rules

    class Meta:
        unique_together = (("ts_end", "oid"),)  # This ensures no rule can have two current versions.
        get_latest_by = "ts"

    def save(self, *args, **kwargs):
        self.vid = uuid.uuid4()  ## Make sure every save is a new entry
        if self.oid is None:  # no oid, means this is a new entry (not an update of an existing one)
            self.oid = uuid.uuid4()
            self.op = "A"
        else:  ## it's an update of an existing entry
            self.op = "U"
        super().save(force_insert=True, *args, **kwargs)  # Call the "real" save() method.

    def delete(self, using=None, keep_parents=False):
        self.vid = uuid.uuid4()  ## Force a new row.
        self.op = "D"  # Mark it as the final state of the Rule.
        super().save(force_insert=True)  # Call the "real" save() method.
        self.oid = None

class ActiveRuleManager(models.Manager):
def get_queryset(self):
return super().get_queryset().filter(ts_end=None).exclude(op="D")

To add the trigger add this to your migrations script as well

migrations.RunSQL(
"""CREATE TRIGGER IF NOT EXISTS rules_rule_insert_trigger BEFORE INSERT on rules_rule FOR EACH ROW BEGIN UPDATE
rules_rule SET ts_end = new.ts WHERE oid = new.oid AND ts_end IS NULL; END;"""
)

Usage has shown that most of the operations only care about the currently active rules. To make this easy the Rule model has two Managers. "active" only returns objects that are currently active, while the common "objects" keeps returning all rows, allowing access to all versions of the rules. This removes internal logic workings of the insert-only table from common usages.


> > > from rules.models import Rule
> > > Rule.active.count()  # all active rules 3 # There are 3.
> > > Rule.objects.count()  # Get all versions of all rules 47 # There are 47 of those.
> > > Rule.active.latest().oid # the id of the active rule that was changed the latest UUID('24eb3231-0dc2-451d-b477-2086798a9d12')
> > > Rule.objects.filter(oid=Rule.active.latest().oid).count() # How many versions of that rule are there? 2

PostgreSQL triggers

CREATE OR REPLACE FUNCTION insert_only_update()
RETURNS trigger AS
$$
BEGIN
UPDATE rule SET ts_end = NEW.ts WHERE pid = NEW.pid AND ts_end IS NULL;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_only_rule_trigger
BEFORE INSERT
ON rule
FOR EACH ROW
EXECUTE PROCEDURE insert_only_update();