Sync framework motivation

by Jeff Vroom

One of the larger complicating factors in software design, performance, and maintenance begins when two programs need to communicate. The developer needs to design or choose a protocol, and perhaps add versioning so it can be easily upgraded or changed. They need to make it secure and reliable, and possibly deal with retries in case of failure or handle the additional connection-related errors that might occur.

In my experience, quite a bit of code goes into solving this problem. It's one of the large contributors to "code that's not application logic".

Newer frameworks generate the serialization code using annotations on methods and their parameters or use a declarative file to define the protocol which generates code for both client and server. This approach reduces the amount of code developers have to write and increases the power of the serialization layers.

This is a good first step, but I believe many client/server programming problems are really problems involving synchronizing data structures that live in different processes. If the object has a unique id, it's usually simpler to enforce the "highlander principle" in your APIs so that there's only a single instance of that entity in memory for a given thread at a time. Your code is safe knowing that it has the right instance, and you can read or write as required. This approach avoids errors that arise when your code is doing more complex things, such as communicating with a remote process which needs to commit the change to a database.
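To make the "highlander principle" concrete, here is a minimal sketch of a per-thread identity map in plain Java. It is not StrataCode's implementation; the names UserEntity and UserIdentityMap are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity type used only for this illustration.
class UserEntity {
    final long id;
    String firstName;

    UserEntity(long id) {
        this.id = id;
    }
}

// Hypothetical per-thread identity map: at most one instance per id, per thread.
class UserIdentityMap {
    private static final ThreadLocal<Map<Long, UserEntity>> INSTANCES =
        ThreadLocal.withInitial(HashMap::new);

    // Returns the instance already known to this thread, creating it on first use.
    static UserEntity resolve(long id) {
        return INSTANCES.get().computeIfAbsent(id, UserEntity::new);
    }
}
```

Any code on the same thread that resolves id 42 gets the same object, so a write made through one reference is visible through every other reference.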

Given that philosophy, why not use synchronization as the base metaphor and build RPC on top of that? In this new model, some server types and properties are marked as synchronized for a given client with additional metadata that controls how they are to be synchronized. The client can request a data model, or it may be populated automatically with a set of synchronized objects when it's initialized.

When the client needs to make changes, the same metadata is used in the opposite direction. Changes are queued up and, on the next 'sync', sent to the server and applied there. Certain properties are fetched on demand; others may be explicitly fetched using an API. When a new object is created on the server in response to a client's request, it can automatically be sync'd to the client in the reply, or lazily fetched when it's first referenced by an object being sync'd to the client. In many applications, declarative annotations let you reduce or eliminate the wrapper code and many of the RPC methods you would otherwise have to code by hand. When RPC is needed, it works naturally with the sync framework because you can use synchronized objects in any method call. They can be passed by reference or by value as required, based on the sync protocol.
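The "queue changes, then send them on the next sync" idea can be sketched in a few lines of plain Java. This is an illustration under my own assumptions, not the framework's code: PendingChange, PendingChangeQueue, and drainForSync are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of one property change on a synchronized object.
class PendingChange {
    final String objectId;
    final String property;
    final Object value;

    PendingChange(String objectId, String property, Object value) {
        this.objectId = objectId;
        this.property = property;
        this.value = value;
    }
}

// Hypothetical queue: changes accumulate locally and leave in one batch per sync.
class PendingChangeQueue {
    private final List<PendingChange> pending = new ArrayList<>();

    // Called whenever a synchronized property changes.
    synchronized void record(String objectId, String property, Object value) {
        pending.add(new PendingChange(objectId, property, value));
    }

    // Returns everything queued since the last sync, in order, and clears the queue.
    synchronized List<PendingChange> drainForSync() {
        List<PendingChange> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

The batch returned by drainForSync is what a transport layer would serialize and send; a second call with no new changes returns an empty list, so nothing is sent twice.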

Synchronization and layers allow you to organize your code based on its role in the client/server design. Code is split into layers based on whether it runs on both the client and the server, on the server only, or on the client only. This is a nice way to break apart a design to reuse code between client and server, and to use static typing to help navigate what's available on each side and find conflicts at edit time. With this design, StrataCode can, for example, automatically detect when a method call is in a remote process.

Synchronization is built using StrataCode's code-generation, making it easy to debug. Metadata required for synchronization is added as readable code to the generated files. To enable synchronization, fields are converted automatically to get/set methods. The set method fires the required change event to notify the sync system. When a new synchronized instance is created, the system is notified. Remote method calls can be handled using code-gen with proper error detection. If the runtime does not support synchronous remote methods, a remote call is only allowed when it appears in a data binding expression. Otherwise, an error is generated at compile time.
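As a rough picture of what "a field converted to get/set methods where the set method fires a change event" looks like, here is a hand-written approximation using the standard java.beans.PropertyChangeSupport class. The code StrataCode actually generates will differ; TrackedUser and addListener are hypothetical names used only here.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hand-written approximation of a class whose field was converted to get/set.
class TrackedUser {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String firstName; // a plain field in the original source

    String getFirstName() {
        return firstName;
    }

    void setFirstName(String newValue) {
        String oldValue = firstName;
        firstName = newValue;
        // PropertyChangeSupport skips the event when old and new are equal and non-null.
        changes.firePropertyChange("firstName", oldValue, newValue);
    }

    // A sync layer would register here to learn about changes.
    void addListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }
}
```

A listener registered through addListener plays the part of the sync system: it hears about each real change and can queue it for the next sync.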

Sync features

One of the advantages of doing synchronization declaratively is that you can turn on some advanced features through configuration, and by leveraging the synchronization design pattern.

With data binding and data sync in StrataCode, there's no need for asynchronous programming or function closures. You can expose a set of methods to your declarative programmers in the same way that Excel exposes various functions. The declarative programmer can use them in data binding expressions without needing to know if the method runs on the client or the server. And you can change the implementation later without changing their code.

Realtime

Synchronization works in both directions. The client can queue up changes for the server and vice versa. When using polling, the client picks up any queued changes waiting for it. When using a realtime connection, the client can see the results immediately.

Sync and scopes

Synchronization is built on the "scope" mechanism in StrataCode where different objects can be configured to have different lifecycles via an annotation. An object can be global to the system, global for each top-level application (for frameworks that support more than one like a web server), one per browser session, one per application per browser session, one per browser window, or one per request. You can create your own scopes for a situation where you have a 'current something' - e.g. current tenant for a multi-tenant application, or current product for a product display component.
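One way to picture scopes is as a store that keeps one instance of each named object per scope instance (one per session id, one per window id, and so on). The sketch below is plain Java written for illustration only; ScopeStore, resolve, and dispose are hypothetical names, and StrataCode's own scope mechanism is configured by annotation rather than called like this.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical store holding one context per scope instance, e.g. "session:abc".
class ScopeStore {
    private final Map<String, Map<String, Object>> contexts = new ConcurrentHashMap<>();

    // Returns the named object for this scope instance, creating it once on first use.
    Object resolve(String scopeName, String scopeId, String objectName, Supplier<Object> factory) {
        Map<String, Object> context =
            contexts.computeIfAbsent(scopeName + ":" + scopeId, key -> new ConcurrentHashMap<>());
        return context.computeIfAbsent(objectName, key -> factory.get());
    }

    // Ends the lifecycle of one scope instance, e.g. when a session expires.
    void dispose(String scopeName, String scopeId) {
        contexts.remove(scopeName + ":" + scopeId);
    }
}
```

Two lookups within the same session share an object, two different sessions get separate ones, and disposing a session means the next lookup starts fresh.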

When scopes are nested, synchronization propagates change events from the lower-level scope to the higher-level one and performs the necessary locking to prevent two conflicting threads from running at the same time. So if you have a shared object between two scopes, changes made by one are queued and delivered to the other on the next thread that runs in the destination scope. For example, you can use the StrataCode command line to execute methods in a connected application running in the browser. This makes it easy to write test scripts for client/server applications. Your test script can direct commands to the server or client or both. You can use your script to record important state for verification. The StrataCode object TestPageLoader records the client JavaScript error log, the server log, and the HTML output for the current page on both the client and the server.

Because it uses a stateful protocol, and because object graphs are preserved after deserialization, you can reduce redundant data transmission and eliminate bugs introduced when two copies of the same object instance end up on the client.

Reliability and responsiveness

Using synchronization to manage state has a nice benefit for scaling web applications. You have available a variety of scopes to cache data as needed for different purposes - global, session, or window. When your server caches information, you avoid unnecessary database accesses and reduce request latency. If the client's copy goes away, it can refresh from the server's cache so no temporary state is lost.

Similarly, if that server session goes away because it expired, or because the server process or some router in between was restarted, the client also maintains a copy of the synchronized state which can be used to refresh the server's session. So again, you do not lose any context. This approach is designed to allow developers to support a balance of reliability and efficiency, all from a declarative model. I believe ultimately it will be an easy way to produce scalable web applications with low latency, simpler database architectures, and cheaper infrastructure cost. You are probably already caching data with memcached or Redis or some other "out of process" tool. Eliminate that and have lower latency and more efficiency.

When you really need stateless, you just use 'request' scope at the page or component level. The same models work fine; they just transmit all of the state across the wire on each request. Data binding still supports a really nice way to do declarative RPC in a stateless environment, using the sync system to avoid transmitting the same object twice.

Easier multi-process development

There are many advantages to decoupling components that can be efficiently managed independently. We've all read about Microservices and the success of businesses like Amazon which made decoupling of systems a major priority. But the downsides of a decoupled system are the design costs of choosing the pieces and the complexity of standing up a complete environment composed of many pieces. Developers need a quick and reliable workflow where they can change code, build, run, and test even when they are touching code in more than one process.

With the StrataCode system, you can incrementally build all processes from the same stack of layers. The layer definition files are also the perfect place for the code that updates and restarts processes as necessary. So for any project, just running 'scc list-of-layers' should be the way developers work and the way operations deploys, no matter how many processes or environments are required to run the code. Layers can handle all of the details, including all of the configuration specific to this user's environment. Because StrataCode itself is built on code-generation and supports Maven, multiple processes, and customizing and running scripts, a system like Chef is not really needed. Dev-ops can manage their own set of production layers which keep their environments separate from development.

See these examples: UnitConverter, TodoList, and Program Editor.

Client/server synchronization using layers

Today client/server programmers commonly use remote procedure calls (RPC) with a serialization protocol like JSON to communicate with the server. Some frameworks require that you write code to convert objects to and from JSON; others handle that transparently or declaratively, often via code-generation. The Sync framework also supports RPC and object serialization using code generation. You can add the @Sync annotation or call SyncManager.addSyncType with the property metadata.

The biggest hassle with direct RPC code is that you have to handle the response asynchronously (at least when you are using an asynchronous runtime like the browser). Data binding gives you an option to avoid response listeners: bind to the remote result, or trigger the remote method in a reverse-only binding. When multiple changes fire at the same time, often due to chained bindings or methods run from a binding, the stream of change events is sent to the server and the stream of results is returned. A set of properties is updated, and the UI is updated in response to those changes.
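The effect of binding to a remote result can be imitated in plain Java with a CompletableFuture: application code reads a property, and the property updates itself when the reply arrives. This is only an analogy for what the data binding layer does for you, not StrataCode's mechanism; BoundResult, bindTo, and onChange are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical property whose value is filled in when a remote call completes.
class BoundResult<T> {
    private T value;
    private final List<Consumer<T>> observers = new ArrayList<>();

    BoundResult(T initialValue) {
        this.value = initialValue;
    }

    // Ties this property to a pending remote call; no listener is needed at the call site.
    void bindTo(CompletableFuture<T> remoteCall) {
        remoteCall.thenAccept(this::set);
    }

    synchronized void set(T newValue) {
        value = newValue;
        for (Consumer<T> observer : observers) {
            observer.accept(newValue); // e.g. a UI element refreshing itself
        }
    }

    synchronized T get() {
        return value;
    }

    synchronized void onChange(Consumer<T> observer) {
        observers.add(observer);
    }
}
```

The code that starts the remote call never handles the response itself. It holds a placeholder value until the future completes, and anything observing the property is updated at that point.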

When you use this model, you have declarative/functional access to data you retrieve asynchronously, and can cleanly invoke remote methods in response to events even when they are async. The same data binding expressions will work either in a client/server environment or locally, which makes it easier to reuse domain model code, even when the domain model has remote methods. This is particularly valuable when a business user is editing data-binding expressions that control some combination of client and server functionality.

By its nature, RPC involves copying state back and forth explicitly. Programmers have to write that code by hand and spend a lot of time on tricky code that amounts to reimplementing the same patterns over and over again. What you are really doing here is synchronizing objects, or parts of those objects, between the client and server. StrataCode offers a model for doing just that: synchronize objects automatically using declarative patterns.

You annotate classes, objects and properties with the @Sync annotation, or put them in a shared layer and annotate the layer itself with @Sync. If you set @Sync on a layer, StrataCode detects which layers overlap between different processes and generates code to synchronize just the overlapping classes, objects, and properties back and forth.

When a type is marked with @Sync, calls are inserted into the code to track instance creation, and property changes. Those events are recorded and sent to the client or server on the next sync (i.e. at the end of the request, or at the end of some user-interface interaction).

The synchronization framework lets you build rich, interactive apps with complex graph-based data models, using few or no remote procedure calls, no data transfer objects, and a statically typed protocol, which helps with error detection and makes protocols easy to version. The declarative patterns implemented by the sync framework support the most common use cases: create, update, delete, lazy fetching of references and collections, and more. When your application needs more control, you can set the @Sync annotation on properties or types to fine-tune the behavior. If you still need more control, you can fall back to RPC on the same objects. Or start out with an RPC call, and use synchronization to apply or receive changes to the results. When you call a remote method using already-synchronized objects as parameters, those args are passed by reference, making it much easier to build an API which works both locally and remotely.

The protocol between client and server is designed to be readable and organized, so when you change "firstName" and "lastName" of a User object, it comes across as two property changes of the User type. Serializers are implemented in JSON and SCN (a StrataCode layer which is converted to JS before being shipped to the client).

Read more in the documentation.