Follow

data structure stuff 

let's say that i have a sequence of objects which each reference a same mutable set

i want to store that data in a way that i can at any time quickly access the set, in the state it was when i stored the object

i don't care about what's in the set outside of its current state and the state it was when individual objects were stored

objects can be deleted and once they have been, i don't care about the state the set was at that point

a possibly acceptable approximation may be to have a few extra items in the set when accessing it compared to what it was at the point the object was stored, but i'd like to avoid that

the set is expected to have fewer changes than objects stored

i want storing objects to be fast, reading the state of the set from the object to be almost as fast (it can be a bit slower), and the space it takes to be minimal

is there a data structure that seem obvious to you for that purpose?

re: data structure stuff 

@Claire I'd say: make the start set immutable. for each state change, store the differences relatively to.previous state (chain)
each class has a pointer to some element in that chain (to access the set in the form it was at the time of storage)

re: data structure stuff 

@kouett then access gets slower and slower with time :/

re: data structure stuff 

@Claire why ? if you have a pointer it's ok ?
I may not be up.to the task/or.have correctly understood yr pb lol :p
hope you'll find an answer

re: data structure stuff 

@kouett if you store the difference relative to previous state each time, to get the full state you have to walk through the whole chain

re: data structure stuff 

@Claire hmmm yeah you're right :x

data structure stuff 

@Claire
Can modifying the set be slow? I’m thinking: do a copy before performing a modification to the set. Reference that copy on object storage.

re: data structure stuff 

@lunar it shouldn't be too slow, no, nor make too many copies

re: data structure stuff 

@Claire @lunar modifying the set does not need to copy the whole thing, it can be O(log n), only copying the portion of the data structure it needs to modify, and reusing existing pointers for the rest

Implementations of sets for functional languages often behave this way. Eg, if I have a set s in haskell, insert foo s returns a new version, anything that stored s still has access to the unmodified version of it.

re: data structure stuff 

@Claire Can the number of elements in the set get very large?
In case it has less than a few hundreds elements, you might keep all elements that belong to at least one set in a vector v, and represent each state as a bit vector in which bit i indicates whether v[i] was present in the set at that point.

If you have more elements and can tolerate an overapproximation of the state, the same approach could still work with bloom filters instead of bit vectors, though that would make deleting elements from the set more difficult.

re: data structure stuff 

@caro i don't have definitive numbers on this and it's pretty open-ended but i expect the number of elements in the set can grow to several thousands, though in most cases it would be less; varies wildly from one case to the other but the same set shouldn't vary wildly over a short period of time

re: data structure stuff 

@Claire All right, thousands of bits of state per object is probably a bit too much

Another option might be to store the sets as a binary tree. Each node in the tree contains the changes that happened (elements added/removed) between its left subtree and its right subtree. Each time you modify the set, you add a new leaf to the tree, and new nodes as needed. You can compute the state of an arbitrary leaf by traversing the tree from/to the root.

This is just like what @kouett proposed, except using a tree instead of a linear chain, so you can compute the state in O(log n) time, at the cost of some redundancy.

Sign in to participate in the conversation
Mastodon (instance perso)

This is a small personal instance running on a couple small ARM servers at home.