Summary
Designing for network latency is seldom treated as a first-order principle of system architecture, a mistake that can undo your design, according to eBay architect and Technical Fellow Dan Pritchett. In a recent blog post, Pritchett outlines four steps for making latency part of your design from the outset.
For those who design systems primarily for deployment within a single data center, latency may not seem like much of a concern: networking technologies employed in data centers today, such as affordable Gigabit Ethernet, have steadily reduced latency to the point where, in that environment, it can often be safely ignored.
However, most critical applications must be deployed redundantly, possibly across multiple data centers. Multiple data centers can also bring an application closer to its users, especially when an application serves a global customer base as, for instance, eBay does.
In a recent blog post, Latency Exists, Cope!, eBay architect and technical Fellow Dan Pritchett drives home the importance of considering latency from the outset when architecting a scalable system:
Latency is a critical part of every system architecture. Yet making latency a first order constraint in the architecture is not that common. The result is systems that become heavily influenced by the distance between deployments and limit the business's ability to serve their customers effectively and protect itself against localized disasters...
Latency is another example of how what you don't take into consideration in your architecture will ultimately undo your design. It is one of the more difficult constraints to design for correctly.
He highlights four key principles when architecting a system with latency in mind:
Good Decomposition Allowing components with little functional overlap to be coupled either in code or during deployment will pretty much kill any hope of distributing your architecture across a collection of global data centers. Do it badly enough and you will kill any hope of distributing your architecture across two cities in the same state.
Asynchronous Interactions Companies get tripped up here by exposing an early version of an interface that sets the clients' expectation of synchronous, low latency interactions. As the interface becomes more heavily used it becomes more and more difficult to change that semantic.
Monolithic Data You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors or you will not be able to distribute your architecture across geographies.
Design for Active/Active Service your customers from all of your locations simultaneously. This is a more efficient and responsive approach than an active/passive pattern where only one location is serving traffic at a time. Utilization of your resources will be higher and by placing services nearer your customers, you are better meeting their needs as well.
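The asynchronous-interaction principle above can be sketched in a few lines of Python: rather than blocking on a remote, potentially high-latency call, the caller enqueues a request and moves on while a worker drains the queue. This is a minimal sketch, not code from Pritchett's post; the `submit`, `worker`, and `remote_call` names are illustrative.

```python
import queue
import threading

work_queue = queue.Queue()
results = []

def remote_call(payload):
    # Stand-in for a cross-data-center request that may take hundreds of ms.
    return f"processed:{payload}"

def worker():
    # Drains the queue in the background; the caller never waits on latency.
    while True:
        payload = work_queue.get()
        if payload is None:  # sentinel value: shut the worker down
            break
        results.append(remote_call(payload))
        work_queue.task_done()

def submit(payload):
    # Fire-and-forget: the remote call's latency no longer blocks the caller.
    work_queue.put(payload)

t = threading.Thread(target=worker)
t.start()
for item in ("a", "b", "c"):
    submit(item)
work_queue.join()      # wait for outstanding work (for the demo only)
work_queue.put(None)   # stop the worker
t.join()
print(results)         # -> ['processed:a', 'processed:b', 'processed:c']
```

The catch Pritchett points to is contractual, not mechanical: once clients have seen a synchronous interface, retrofitting a queue like this behind it is hard.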
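Splitting data "along both functional and scale vectors", as the third principle requires, can also be sketched briefly. The Python below is a hypothetical illustration, not eBay's design: the functional vector separates domains (here `users` and `orders`) into independent stores, while the scale vector hash-shards keys within each domain so each slice can be placed or grown independently.

```python
NUM_SHARDS = 4  # illustrative shard count

def shard_for(key: str) -> int:
    # Scale vector: real systems use consistent hashing; modulo keeps the sketch short.
    return hash(key) % NUM_SHARDS

# Functional vector: each domain gets its own set of shards, so user data and
# order data can be deployed, scaled, and geo-placed independently.
stores = {
    "users":  [dict() for _ in range(NUM_SHARDS)],
    "orders": [dict() for _ in range(NUM_SHARDS)],
}

def put(domain: str, key: str, value) -> None:
    stores[domain][shard_for(key)][key] = value

def get(domain: str, key: str):
    return stores[domain][shard_for(key)].get(key)

put("users", "alice", {"email": "alice@example.com"})
put("orders", "order-1", {"user": "alice", "total": 42})
print(get("users", "alice"))   # -> {'email': 'alice@example.com'}
```

Pritchett's point is that this split must be designed in early: a persistence model that assumes one monolithic store is very hard to tease apart once the schema has ossified.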
How early in your design do you take network latency into account?