Back-end Design Document

The Editor

The back-end of the CSESoc CMS revolves around a “modular” architecture. The idea is that we maintain one central instance of a document, managed by a centralised “document manager”. This manager is responsible for creating new document instances, loading the contents of documents from disk into memory, and spinning up and connecting extensions. The modular part of the design comes from extensions: each extension represents a single unit of functionality that must be performed on a document. Examples include concurrent editing for clients outside of the process, automatic backup of the document state to disk, and live-preview generation of the current document for consumption by clients. The key advantages of extensions are maintainability and scalability: one feature does not affect the performance or execution of another, and extending the functionality of the editor is simple. The development process is also isolated, since extensions can be developed independently of each other.

Extensions

As stated earlier, an extension represents a single unit of functionality that needs to be performed on a document. Extensions are described along two independent axes, which gives four labels: administrative or non-administrative, and client-facing or non-client-facing. Administrative extensions can open and close document instances; non-administrative extensions can do neither. Client-facing extensions expose HTTP/WebSocket endpoints that a client can consume. Note that an extension can be both client-facing and administrative.

Opening, closing and loading are handled by the centralised document manager. The manager exposes a set of methods that administrative extensions can call. These methods read files from disk, create document instances, start and load other extensions, and close documents along with all of their associated extensions.
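As a rough illustration of the manager's role, the sketch below keeps one shared instance per open document and refuses requests from non-administrative callers. Every name here (`Manager`, `Document`, `Open`, `Close`, the `load` field and the `administrative` flag) is an assumption made for the example and is not taken from the CMS codebase.

```go
package main

import (
	"errors"
	"os"
	"sync"
)

// Document is the single in-memory instance of a file's contents.
type Document struct {
	Path    string
	Content string
}

// ErrNotAdministrative is returned when a non-administrative extension
// asks the manager to open or close a document.
var ErrNotAdministrative = errors.New("extension is not administrative")

// Manager is a hypothetical document manager; all names are illustrative.
type Manager struct {
	mu   sync.Mutex
	open map[string]*Document
	load func(path string) ([]byte, error) // disk reader, swappable in tests
}

// NewManager returns a manager that reads documents from the local disk.
func NewManager() *Manager {
	return &Manager{open: make(map[string]*Document), load: os.ReadFile}
}

// Open returns the shared instance for path, reading it from disk only the
// first time. A fuller version would also start the document's extensions.
func (m *Manager) Open(administrative bool, path string) (*Document, error) {
	if !administrative {
		return nil, ErrNotAdministrative
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	if doc, ok := m.open[path]; ok {
		return doc, nil
	}
	data, err := m.load(path)
	if err != nil {
		return nil, err
	}
	doc := &Document{Path: path, Content: string(data)}
	m.open[path] = doc
	return doc, nil
}

// Close forgets the in-memory instance. A fuller version would also call
// Close on every extension attached to the document.
func (m *Manager) Close(administrative bool, path string) error {
	if !administrative {
		return ErrNotAdministrative
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.open, path)
	return nil
}
```

Passing the caller's category as a boolean keeps the sketch short; a real design might instead hand administrative extensions a handle that non-administrative ones never receive.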

There are 3 key parts of an extension’s lifecycle:

  • Startup / Construction: During this phase the extension is given the state of the document it is being attached to. It can then preprocess this state or perform any other setup it needs.

  • Running: Each extension exposes a .Spin() method, which is called to begin running the extension. Most of an extension's functionality lives within .Spin().

  • Termination: During termination the .Close() method is called, which signals .Spin() to terminate. The extension can also perform final cleanup operations at this point.

Extensions are programmed in an event-driven manner: every extension supports document-state updates and synchronization operations, and the corresponding callback function is invoked when one of these events occurs.
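The lifecycle and callback model above could be sketched roughly as follows. Only `.Spin()` and `.Close()` come from this document; the `Extension` interface, the `OnUpdate` callback name and the toy `mirror` extension are assumptions for illustration.

```go
package main

import "sync"

// Extension is a hypothetical interface for the lifecycle described above.
type Extension interface {
	Spin()                 // runs until Close is called
	Close()                // signals Spin to stop and waits for it
	OnUpdate(state string) // callback for document-state updates
}

// mirror is a toy extension that only remembers the latest document state.
type mirror struct {
	mu      sync.Mutex
	state   string
	updates chan string
	done    chan struct{}
	stopped chan struct{}
}

// newMirror is the construction phase: the extension receives the current
// state of the document it is being attached to.
func newMirror(initial string) *mirror {
	return &mirror{
		state:   initial,
		updates: make(chan string),
		done:    make(chan struct{}),
		stopped: make(chan struct{}),
	}
}

// Spin is the running phase: it handles events until told to stop.
func (m *mirror) Spin() {
	defer close(m.stopped)
	for {
		select {
		case s := <-m.updates:
			m.mu.Lock()
			m.state = s
			m.mu.Unlock()
		case <-m.done:
			return
		}
	}
}

// OnUpdate hands a new state to Spin, or gives up once the extension closes.
func (m *mirror) OnUpdate(state string) {
	select {
	case m.updates <- state:
	case <-m.done:
	}
}

// Close must be called exactly once, after Spin has been started.
func (m *mirror) Close() {
	close(m.done)
	<-m.stopped
}

// State returns the most recent document state seen by the extension.
func (m *mirror) State() string {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.state
}

// Compile-time check that mirror satisfies the interface.
var _ Extension = (*mirror)(nil)
```

A caller would typically construct the extension, run `go ext.Spin()`, deliver updates through `OnUpdate`, and call `Close()` when the document is closed.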

MVP Extensions

The initial release of the CSESoc CMS back-end will support the following extensions: concurrent editing, automatic backup and preview generation.

Concurrent Editing

Concurrent editing will be a client-facing, administrative extension. A consequence of this exposure is that, before any document is loaded, the permissions and privileges of the client wishing to consume the extension must be checked. The concurrent editing extension exposes an HTTP endpoint that is upgraded to a WebSocket connection if the client has sufficient permissions. The general design of the concurrent editing extension is as follows:

  • A client requests to edit a document. First, the client's permissions are checked by an authentication/permission service; if the client is allowed to open and access the document, the concurrent editing extension creates an “edit session” for this client.

  • Internally, the edit session is a structure containing a client shadow, which is required for document synchronization under the differential synchronization algorithm. The structure also contains the client ID and a pointer to the WebSocket connection buffer. The concurrent editing extension additionally keeps a map from client IDs to the ID of the document each client is editing.

  • The concurrent editing extension then lodges a request with the document manager to load the file from disk into memory and to load the automatic backup and preview generation extensions.

  • When a client pushes changes, the concurrent editing extension finds the associated document and asks the document manager to synchronize the document's internal state with the pushed changes. Once this is done, the resulting changes are propagated to all other clients associated with the document.

An important thing to note here is that when changes are pushed, every extension loaded for the document also receives them; this is the core idea behind extensions.
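The edit-session bookkeeping described in the steps above might look something like this sketch. It is deliberately simplified: a buffered channel stands in for the WebSocket connection buffer, document state is held locally rather than by the document manager, and a pushed change simply overwrites the state instead of running differential synchronization. All type, field and method names are assumptions.

```go
package main

// EditSession follows the structure described above. Outbox is a stand-in
// for the WebSocket connection buffer (an assumption for this sketch).
type EditSession struct {
	ClientID string
	Shadow   string      // last text known to be in sync with this client
	Outbox   chan string // changes waiting to be sent to this client
}

// ConcurrentEditor is a hypothetical, single-goroutine sketch of the
// extension's bookkeeping; it is not safe for concurrent use as written.
type ConcurrentEditor struct {
	docOf    map[string]string       // client ID -> document ID
	sessions map[string]*EditSession // client ID -> edit session
	state    map[string]string       // document ID -> text (manager-owned in the real design)
}

func NewConcurrentEditor() *ConcurrentEditor {
	return &ConcurrentEditor{
		docOf:    make(map[string]string),
		sessions: make(map[string]*EditSession),
		state:    make(map[string]string),
	}
}

// Join creates an edit session for a client whose permissions have already
// been checked elsewhere.
func (e *ConcurrentEditor) Join(clientID, docID string) *EditSession {
	s := &EditSession{
		ClientID: clientID,
		Shadow:   e.state[docID],
		Outbox:   make(chan string, 16),
	}
	e.docOf[clientID] = docID
	e.sessions[clientID] = s
	return s
}

// Push records a client's new text and forwards it to every other client
// editing the same document. Placeholder: the text replaces the state
// outright rather than being merged via diff and patch.
func (e *ConcurrentEditor) Push(clientID, text string) {
	docID, ok := e.docOf[clientID]
	if !ok {
		return // unknown client: no session, nothing to do
	}
	e.state[docID] = text
	e.sessions[clientID].Shadow = text
	for id, s := range e.sessions {
		if id == clientID || e.docOf[id] != docID {
			continue
		}
		select {
		case s.Outbox <- text:
		default: // buffer full: drop rather than block the editor
		}
	}
}
```

Dropping on a full buffer keeps one slow client from stalling the others; whether that trade-off suits the CMS is a design decision this sketch does not settle.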

Preview Generation

Preview generation is a non-administrative, client-facing extension. Because it is non-administrative and so cannot open or close documents, security isn't a concern. Preview generation exposes an HTTP endpoint and serves a compiled version of its local document state at that endpoint. The internal details are simple, so little more needs to be said about it.
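A minimal sketch of such an endpoint is shown below. This document does not specify what "compiling" a document involves, so the example substitutes a placeholder that HTML-escapes the text and wraps it in a `<pre>` element; the `Preview` type and its method names are likewise assumptions.

```go
package main

import (
	"html"
	"net/http"
	"sync"
)

// Preview is a hypothetical preview extension holding its own local copy
// of the document state.
type Preview struct {
	mu    sync.RWMutex
	state string
}

// OnUpdate replaces the local copy when the document changes.
func (p *Preview) OnUpdate(state string) {
	p.mu.Lock()
	p.state = state
	p.mu.Unlock()
}

// Render is a placeholder for the real compile step: it escapes the text
// and wraps it in a <pre> element.
func (p *Preview) Render() string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return "<pre>" + html.EscapeString(p.state) + "</pre>"
}

// ServeHTTP serves the rendered preview, so a *Preview can be registered
// directly as an http.Handler.
func (p *Preview) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	_, _ = w.Write([]byte(p.Render()))
}

// Compile-time check that Preview satisfies http.Handler.
var _ http.Handler = (*Preview)(nil)
```

The handler could be mounted with something like `http.Handle("/preview", p)`, where the path is again only an example.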

Automatic Backup

This is a non-administrative, non-client-facing extension. Its sole purpose is to write the current state of its document to disk every n seconds; once again, a rather simple extension.
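One way to sketch the periodic write is with a ticker inside `.Spin()`, as below. The `Backup` type, its constructor, the `OnUpdate` callback and the final flush on `.Close()` are assumptions for the example; only the "write every n seconds" behaviour comes from the description above.

```go
package main

import (
	"os"
	"sync"
	"time"
)

// Backup is a hypothetical automatic-backup extension.
type Backup struct {
	interval time.Duration
	write    func(state string) error // destination, swappable in tests
	mu       sync.Mutex
	state    string
	done     chan struct{}
}

// NewBackup returns an extension that writes the latest state to path
// once per interval.
func NewBackup(path string, interval time.Duration, initial string) *Backup {
	return &Backup{
		interval: interval,
		write: func(state string) error {
			return os.WriteFile(path, []byte(state), 0o644)
		},
		state: initial,
		done:  make(chan struct{}),
	}
}

// OnUpdate records the newest document state; nothing is written yet.
func (b *Backup) OnUpdate(state string) {
	b.mu.Lock()
	b.state = state
	b.mu.Unlock()
}

// flush writes the most recent state to the destination.
func (b *Backup) flush() error {
	b.mu.Lock()
	state := b.state
	b.mu.Unlock()
	return b.write(state)
}

// Spin writes on every tick until Close is called, then writes once more
// so the last edits are not lost. Write errors are ignored in this sketch.
func (b *Backup) Spin() {
	ticker := time.NewTicker(b.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			_ = b.flush()
		case <-b.done:
			_ = b.flush()
			return
		}
	}
}

// Close signals Spin to stop; call it exactly once.
func (b *Backup) Close() { close(b.done) }
```

Writing the whole document on each tick is the simplest approach; writing to a temporary file and renaming it would make each backup atomic, at the cost of a little more code.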

Possible Further Extensions

Once the MVP is completed, some possible ideas for additional extensions and functionality are as follows:

  • Diff Accumulation: One simple idea is to accumulate changes made to a document and then email an owner of the document once changes have been made.

  • Duplicate Detection: On a more complicated note, one possible extension is the real-time scanning of a document to detect if it is contextually similar to another document that has been published in the past.

File storage + Permissions

File storage and permissions management is a fairly standard system. The file system service exposes a set of public APIs consumable from the front end:

  • POST storage/create [type: DOCUMENT | DIRECTORY, name, if DOCUMENT: underlying template, optional: accessibility]

  • GET storage/children [DIRECTORY name]

  • GET storage/info [documentID | path: root -> parent, documentName, clientToken] -> {ID, myPermissions, underlying template, snapshot}

The set of APIs is rather simple. storage/create creates a new document or directory with a specified name; optionally, we can specify who can access this document or directory. storage/children returns all the children of a specified directory, and storage/info returns all relevant information regarding a document. Note that storage/info expects a registered client token.
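To make the request and response shapes more concrete, they could be modelled along these lines. The field names, JSON tags, field types and validation rules are assumptions inferred from the parameter list above, not a specification of the real API.

```go
package main

import "errors"

// EntityType is the kind of entity storage/create can make.
type EntityType string

const (
	Document  EntityType = "DOCUMENT"
	Directory EntityType = "DIRECTORY"
)

// CreateRequest is a hypothetical body for POST storage/create.
type CreateRequest struct {
	Type          EntityType `json:"type"`
	Name          string     `json:"name"`
	Template      string     `json:"template,omitempty"`      // only meaningful for documents
	Accessibility string     `json:"accessibility,omitempty"` // optional access setting
}

// Validate applies the rules implied by the parameter list: a name is
// always needed, and a DOCUMENT must name its underlying template.
func (r CreateRequest) Validate() error {
	if r.Name == "" {
		return errors.New("name is required")
	}
	switch r.Type {
	case Document:
		if r.Template == "" {
			return errors.New("a DOCUMENT requires an underlying template")
		}
	case Directory:
		// Directories have no template, so there is nothing more to check.
	default:
		return errors.New("type must be DOCUMENT or DIRECTORY")
	}
	return nil
}

// InfoResponse is a hypothetical body returned by GET storage/info.
type InfoResponse struct {
	ID            string   `json:"id"`
	MyPermissions []string `json:"myPermissions"`
	Template      string   `json:"template"`
	Snapshot      string   `json:"snapshot"`
}
```

Validating in one place before touching storage keeps malformed requests from creating half-formed entries.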

Authentication

Authentication happens via an authentication service. Clients log in with a username and password and are issued a client token; to maintain state, this token is passed along for the entire duration of the client's connection.
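The login-then-token flow could be sketched as follows. How credentials are actually verified and what form the token takes are not stated in this document, so the `verify` callback, the random hex token and all names here are assumptions.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"sync"
)

// ErrBadCredentials is returned when the username or password is rejected.
var ErrBadCredentials = errors.New("invalid username or password")

// AuthService is a hypothetical in-memory authentication service.
type AuthService struct {
	verify func(username, password string) bool // credential check, supplied by the caller
	mu     sync.Mutex
	tokens map[string]string // client token -> username
}

func NewAuthService(verify func(username, password string) bool) *AuthService {
	return &AuthService{verify: verify, tokens: make(map[string]string)}
}

// Login checks the credentials and issues a random 128-bit client token,
// hex-encoded, which the client then presents on later requests.
func (a *AuthService) Login(username, password string) (string, error) {
	if !a.verify(username, password) {
		return "", ErrBadCredentials
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := hex.EncodeToString(buf)
	a.mu.Lock()
	a.tokens[token] = username
	a.mu.Unlock()
	return token, nil
}

// Whoami resolves a client token back to the user it was issued to.
func (a *AuthService) Whoami(token string) (string, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	username, ok := a.tokens[token]
	return username, ok
}
```

Other services, such as the storage/info endpoint, would call something like `Whoami` to turn the presented token into an identity before checking permissions.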

 
