reimplement sync client
- migrate to a non-SQLite backend that is compatible with Android (SSTable now is, as it no longer uses hard links)
- take into account the 'commit time' of entries
- implement historical syncing
- a super simple client library where it is easy to swap in a different persistent storage (e.g. browser based) and a different wire protocol (e.g. wrapping the messages in WebSocket frames)
- see #478. Support delta updates of entries: by sending over only the modified fields of a BinaryJSON entry, smaller updates can be made. This allows for simpler data models where data does not need to be split according to data lifetimes (no combining is needed in the client). Be careful with deleted fields. Probably only works with historical syncing and/or a live update system (not sending over the state hashes, and sending diffs). To implement this efficiently server side, probably need one of two things:
  1. when an entry is modified, also send over the old object so a diff can be made (implemented in `BinaryObject::CreateDiffObject()`);
  2. for composite objects (from multiple sources), store the last-modified time of every field in the metadata of the entry. For (2) it would be nice if the streaming store could automatically calculate and set those metadata fields.
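The delta-update idea above amounts to a field-level diff. A minimal Python sketch, using a plain dict as a stand-in for a BinaryJSON entry; `create_diff`, `apply_diff`, and the `DELETED` sentinel are hypothetical names, not the actual `BinaryObject::CreateDiffObject()` API:

```python
# Sentinel marking a removed field: without an explicit marker the client
# cannot distinguish "unchanged" from "deleted" (the caveat noted above).
DELETED = object()

def create_diff(old: dict, new: dict) -> dict:
    """Return only the fields that changed between old and new."""
    diff = {}
    for key, value in new.items():
        if key not in old or old[key] != value:
            diff[key] = value          # added or modified field
    for key in old:
        if key not in new:
            diff[key] = DELETED        # field was removed
    return diff

def apply_diff(base: dict, diff: dict) -> dict:
    """Apply a diff produced by create_diff to a base entry."""
    result = dict(base)
    for key, value in diff.items():
        if value is DELETED:
            result.pop(key, None)
        else:
            result[key] = value
    return result
```

Applying `create_diff` and then `apply_diff` round-trips to the new entry, so only the modified fields ever need to cross the wire.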
- opt for an easier data setup: no direct writing by clients in the main data store, only indirectly via the server. The server writes all updates/deletes to a separate table, which is processed by a script. This script can apply merge strategies (such as highest, latest, total). It avoids race conditions too: currently a remove totally forgets about the item, so if an earlier update arrives at the server later, it will overwrite the delete. This can make the server side multi-master. Even removes are just 'actions' sent to the server. Drawback: the item disappears from the client for a short period of time (this can be avoided with a two-step send: the first step is a received confirmation, the second a processed confirmation).
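The merge strategies named above ('highest', 'latest', 'total') could look roughly like this. A hedged sketch; the function name and signature are assumptions, not part of the server:

```python
def merge(strategy: str, existing, incoming, existing_time=0, incoming_time=0):
    """Resolve two concurrent writes to the same entry according to a strategy."""
    if strategy == "highest":
        return max(existing, incoming)            # keep the maximum value
    if strategy == "latest":                      # keep the most recent write
        return incoming if incoming_time >= existing_time else existing
    if strategy == "total":
        return existing + incoming                # sum contributions
    raise ValueError(f"unknown strategy: {strategy}")
```

Because the script applies one of these per field or per entry, out-of-order arrival at the server no longer matters, which is what makes the multi-master setup possible.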
- allow certain frontend paths to be mapped to different backend paths with a certain selector. Also support filters based on 'tags', e.g. tag `topbeers` on `beers` maps to `samson/beers` with filter `topBeers???false`.
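The tag-based mapping could be a small lookup table. A sketch under assumptions: the table shape and `resolve` function are hypothetical; the `topbeers` row mirrors the example above (including its filter expression verbatim):

```python
from typing import Optional

# (frontend path, tag) -> (backend path, filter expression)
MAPPINGS = {
    ("beers", "topbeers"): ("samson/beers", "topBeers???false"),
}

def resolve(frontend_path: str, tag: Optional[str]):
    """Map a frontend path (optionally tagged) to a backend path plus filter."""
    if tag is not None and (frontend_path, tag) in MAPPINGS:
        return MAPPINGS[(frontend_path, tag)]
    return (frontend_path, None)  # default: identity mapping, no filter
```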
- switch regular (non-`download()`) payloads to binary objects with a couple of reserved fields starting with '_' (such as `_mtime`). This will allow for efficient diffs and for combining information in a single object (with `combine()`). WARNING: probably need a `BinaryObject::Hash` method that hashes in a canonical way (so compact versions of an object result in the same hash).
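Canonical hashing means the hash is computed over a normalized serialization, so that differently ordered or differently compacted encodings of the same object hash identically. A minimal Python sketch of the idea behind the proposed `BinaryObject::Hash` (using JSON text and SHA-256 as stand-ins for the binary encoding):

```python
import hashlib
import json

def canonical_hash(obj: dict) -> str:
    """Hash an object via a canonical serialization: sorted keys, no
    insignificant whitespace, so equal objects always hash equally."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Without this normalization, a compacted and a non-compacted version of the same entry would produce different state hashes and trigger spurious syncs.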
- remove regular types and encode them in path prefixes (to prevent confusion between path names, types, and filters).
Open:
- how to support binary, BinaryJSON, and other types? Work with a post-processing filter? Simple option: live subscribe is always BinaryJSON, one-time downloads can be any type.
- an easy way to combine fixed content items with your own ratings/notes of that item. Because of the sorted property, maybe use client-side merging using the same ids, with `SSTable::Merge`. So `user/rating/42` and `beer/42` will be merged by doing a `combine("user/note", "user/rating", "beer")`, resulting in `{"user/note":{...},"user/rating":{...},"beer":{...}}` objects. The initial send can be implemented efficiently, but updates might need to access the data. This needs to be a primitive in `PushBackend` itself.
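The shape of the combined object from the example above can be sketched as follows. The `combine` helper and the sample data are hypothetical; only the prefix names come from the note:

```python
def combine(sources: dict) -> dict:
    """Merge the per-source entries for one id into a single nested object,
    keyed by source prefix; sources without an entry are simply absent."""
    return {prefix: entry for prefix, entry in sources.items() if entry is not None}

# Illustrative data for id 42: the fixed content item plus the user's own data.
merged = combine({
    "user/note": {"text": "great with stew"},
    "user/rating": {"stars": 4},
    "beer": {"name": "Westmalle Tripel"},
})
```

Keying by prefix keeps the sources separable on the client, so a user's rating can be updated without touching the fixed `beer` entry.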
- How to send over all the items relevant for a personal profile? Setting a separate tag for every user is not scalable. This applies to items from the common/general part of the data store that should be available to the client (e.g. all the beers a user has rated). Solution: define 'active' sets. Each active set has multiple expressions over a streaming store path, converting entries under that path to ids that the client should sync. This converts e.g. the user profile into the ids of items the user has rated, so these are always available. For this to work, aggregate functions in StreamingServer should work, because multiple entries can map to one id, and it should account for changes live. This results in a map from entry to a count (how many times the id was generated). If the count is zero, the client should no longer have the entry. Alternative: do the aggregation in the syncd daemon. Implement a `map()` function in StreamingStore (an initial set of ids, with a function to update these ids). That `map()` results in the items. This will cost some memory in the StreamingStore. Multiple 'source' expressions can be enabled/disabled (for example: all the venues that should be cached), and this results in one item set.
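The entry-to-count aggregation described above is essentially reference counting of generated ids. A minimal sketch, assuming an `ActiveSet` class that the aggregator (StreamingServer or syncd) would maintain; the class and its method names are hypothetical:

```python
from collections import Counter

class ActiveSet:
    """Tracks how many source entries generated each id; an id stays
    active (i.e. the client keeps the entry) while its count is positive."""

    def __init__(self):
        self.counts = Counter()

    def add(self, ids):
        """Source entries generated these ids (duplicates allowed)."""
        self.counts.update(ids)

    def remove(self, ids):
        """Source entries no longer generate these ids."""
        self.counts.subtract(ids)
        self.counts += Counter()  # Counter addition drops non-positive counts

    def active_ids(self):
        return set(self.counts)
```

When a count drops to zero the id leaves the set, which is exactly the signal that the client should no longer have the entry.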
Probably use one ResourceOwner that handles both the network and the storage itself. For a client this is OK (storage and network are not the bottleneck, and it saves resources). The data structures are:
- queue of changes
- store of bulk of entries
- some meta information (userid, token)
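The three data structures above could be held together in one client state object. A sketch; all field names are assumptions:

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SyncClientState:
    pending_changes: deque = field(default_factory=deque)  # queue of changes
    entries: dict = field(default_factory=dict)            # bulk of entries, keyed by path
    user_id: Optional[str] = None                          # meta information
    token: Optional[str] = None
```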
Setting up a connection can be done with:
- hostname + port + domain
- prefixes with types therein (so no global types)
API:
- store (store and upload)
- remove (id, or path)
- upload (upload only, useful for stats)
- download (do not store locally, only a one-time download, e.g. for images which can be stored in an LRU cache or similar)
- subscribe (path)
- logout
- create user
- login
- request password reset
- change mail
- change password
- temporarySync (add temporary path + types, with ttl, useful for large data sets that you don't want locally)
- status() Rx endpoint for authentication errors or sync errors
- registerNotificationChannel(WebPush, FCM, ApplePush, [<path, [types]>] notificationsFor)
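The API list above can be summarized as an interface. A hedged sketch: only the method names come from the note; the signatures, parameter names, and snake_case spellings of the multi-word entries are assumptions:

```python
class SyncClient:
    """Abstract surface of the sync client API; subclasses implement
    the actual storage and wire protocol."""

    def store(self, path, entry): raise NotImplementedError       # store locally and upload
    def remove(self, id_or_path): raise NotImplementedError
    def upload(self, path, entry): raise NotImplementedError      # upload only (e.g. stats)
    def download(self, path): raise NotImplementedError           # one-time, not stored locally
    def subscribe(self, path): raise NotImplementedError
    def logout(self): raise NotImplementedError
    def create_user(self, mail, password): raise NotImplementedError
    def login(self, mail, password): raise NotImplementedError
    def request_password_reset(self, mail): raise NotImplementedError
    def change_mail(self, new_mail): raise NotImplementedError
    def change_password(self, new_password): raise NotImplementedError
    def temporarySync(self, path, types, ttl): raise NotImplementedError
    def status(self): raise NotImplementedError                   # Rx endpoint for auth/sync errors
    def registerNotificationChannel(self, channel, notificationsFor): raise NotImplementedError
```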
Simple mmap backend supported on Android:
- Create files of 1 MiB each, containing entries.
- New entries are written to the current one.
- If an old file needs to be recycled, write its still-valid entries to the current file. Recycle a file once 75% of its storage is no longer needed.
- Tombstones can go in the current file (to avoid race conditions)
- Optional: have 3 tiers of age. If data from the first is recycled, put it on the next tier. Can reduce the number of bytes written.
- TODO: are there race conditions? Probably parts of file can be missing.
- In the short run: use SSTables (which are now compatible with Android)
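The recycling scheme above can be sketched in memory. The 1 MiB segment size and the 75%-dead threshold come from the note; everything else (class name, methods, the in-memory segment lists standing in for mmap'ed files) is an assumption:

```python
SEGMENT_SIZE = 1 * 1024 * 1024  # 1 MiB per file; a real impl rolls on size

class SegmentStore:
    """Append-only segments; a segment is recycled when <= 25% of its
    entries are still live, by rewriting the live ones into the current
    segment. Tombstones (value=None) also go into the current segment."""

    def __init__(self):
        self.segments = [[]]   # each segment: list of (key, value|None) pairs
        self.live = {}         # key -> latest live value

    def _current(self):
        return self.segments[-1]

    def roll(self):
        """Start a new current segment (stand-in for opening a new file)."""
        self.segments.append([])

    def append(self, key, value):
        self._current().append((key, value))
        if value is None:
            self.live.pop(key, None)   # tombstone
        else:
            self.live[key] = value

    def live_fraction(self, segment):
        if not segment:
            return 0.0
        alive = sum(1 for k, v in segment
                    if v is not None and self.live.get(k) == v)
        return alive / len(segment)

    def maybe_recycle(self, index):
        """Recycle segment `index` (assumed not the current one) if at most
        25% of its entries are still needed."""
        segment = self.segments[index]
        if self.live_fraction(segment) > 0.25:
            return False
        for key, value in segment:
            if value is not None and self.live.get(key) == value:
                self._current().append((key, value))  # rewrite live entries
        self.segments[index] = []                      # recycle the file
        return True
```

The race-condition TODO above still applies: in a real mmap-backed version, a crash mid-rewrite could leave parts of a file missing, so rewrites need to be ordered (copy first, then mark the old segment free).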
New version of the sync protocol and library; allows for on-demand retrieval and easier integration across language barriers (by minimizing conversions):
- a binary JSON filter expression is set on every prefix, and can be changed dynamically (so only items relevant to the item currently on screen are sent over);
- every subscribe() also supports binary JSON filter expression (possibly a selector?);
- separate methods for upload and download (not cached);
- the consequence is that all cached items are binary JSON (!), and there is possibly no need to store different versions server side.
Library support needed:
- validate that a BinaryJSON expression is well-formed, does not contain certain expensive operations (like hashing), and does not access certain fields;
- Rx should incorporate paging/window size.
Edited by Bernard van Gastel