There is client-server software, and there is peer-to-peer software. Are
these two architectures all we ever need for distributed software?
I'd like to suggest that some of the world's most successful distributed software
architectures are neither client-server nor peer-to-peer when you look
at them closely. They follow an architectural pattern that I'd like to
call the 4-Point Architecture.
Consider e-mail, a distributed and network-centric application if there
ever was one. To accurately describe how e-mail gets sent from my computer to
your computer, we need four computers in the picture (the
"four points" of this architecture):

In this case, My Computer first hands the e-mail message to my SMTP server
through one of several possible protocols (e.g. SMTP, ...). My SMTP
server then uses SMTP to send the message to your SMTP server, which
in turn holds the message until Your Computer downloads it using a
protocol such as POP.
In this architecture, we have elements of both peer-to-peer and client-server:
either SMTP server can initiate communication to another SMTP server, which
makes them peer-to-peer. (However, SMTP is not a symmetrical protocol, which
means we run client-server on top of peer-to-peer.) The relationship between
an SMTP server and its client computer is clearly a client-server
relationship.
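
To make the hops concrete, here is a minimal in-memory sketch of the e-mail path described above. It is a toy model, not real SMTP or POP: the `MailServer` class, its `submit`, `accept` and `fetch` methods, and the `a.example` / `b.example` domains are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MailServer:
    """A server in the middle of the picture: it relays and stores mail."""
    domain: str
    mailboxes: dict = field(default_factory=dict)  # recipient -> messages

    def submit(self, network, sender, recipient, body):
        """A client computer hands a message to its own server."""
        _, _, domain = recipient.partition("@")
        # Server-to-server hop: relay to the server for the recipient's domain.
        network[domain].accept(sender, recipient, body)

    def accept(self, sender, recipient, body):
        """Another server relays a message to us; hold it until it is fetched."""
        self.mailboxes.setdefault(recipient, []).append((sender, body))

    def fetch(self, user):
        """The recipient's computer downloads, and thereby drains, its mail."""
        return self.mailboxes.pop(user, [])


# Toy "network": a lookup from domain name to server (stands in for DNS).
network = {d: MailServer(d) for d in ("a.example", "b.example")}

# My Computer -> my server -> your server ...
network["a.example"].submit(network, "me@a.example", "you@b.example", "hello")
# ... -> Your Computer, which initiates the download itself.
inbox = network["b.example"].fetch("you@b.example")
```

Note that both client computers only ever initiate calls to their own server; only the two servers talk across the network.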
If we redraw the picture in a more generic manner (I will get to the
labeling in a second), we arrive at this:

The 4-Point Architecture is also used by the following technologies:
- Jabber/XMPP presence and instant messaging. Jabber clients log into Jabber
servers using an asymmetrical protocol. Jabber servers talk to each other
using a symmetrical protocol. Same picture.
- Modern P2P file sharing networks. It turns out that virtually all of
them distinguish between regular nodes and super nodes. While the
communication between nodes may allow for an entirely peer-to-peer
form of communication, in practice only subsets of this protocol are
used, making peer-super-peer relationships client-server and
super-peer-super-peer relationships peer-to-peer. While dynamic
reassignment of roles may take place, they still use the 4-Point
Architecture at any point in time.
- Update April 25: Of course, blogs work the same way as well: we
edit locally on our PCs but publish on visible servers with a defined
address.
- There are lesser-known ones as well, including what we have in our product.
We could call this architecture a mix of client-server and peer-to-peer.
However, it comes with some rules that cannot adequately be described
without looking at all 4 points of the architecture:
- The computers in the first row must be addressable and routable by any other
computer in the first row. As a result, they must belong to the
"bright" internet (i.e. those parts of the internet that
can be reliably addressed and found).
- No computer in the second row needs to be addressable and routable.
To communicate, it must be able to initiate a connection to a bright
point, but there is no requirement that a bright point initiate a
connection on its own. Therefore, the computers in the second row
may belong to the "dark" internet (i.e. those parts of the
internet, such as laptops, that may have rapidly changing IP addresses,
move quickly from network to network, and generally do not have
a well-defined DNS name).
- Any dark point normally interacts with exactly one bright point (at least for
a given application or protocol). It is extremely rare for a dark point to
interact with more than one bright point for reasons other than
basic availability of bright points. Thus the relationship between
a bright point and its dark points is a fairly stable one.
- No dark point ever interacts directly with another dark point; they
always go through their respective bright points first.
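
The rules above can be sketched as a single connection check. This is only an illustration under simplifying assumptions: the `Point` type, its `home` field and the `may_connect` function are hypothetical names, and the sketch treats "a bright point need not connect to a dark point" as "never does".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    name: str
    bright: bool
    home: Optional[str] = None  # for a dark point: the name of its bright point


def may_connect(initiator: Point, target: Point) -> bool:
    """Return True if `initiator` may open a connection to `target`."""
    if not target.bright:
        # Dark points need not be addressable, so nobody connects to them.
        return False
    if initiator.bright:
        # Bright points can address and reach every other bright point.
        return True
    # A dark point talks only to its own bright point.
    return initiator.home == target.name


my_server = Point("my-server", bright=True)
your_server = Point("your-server", bright=True)
my_laptop = Point("my-laptop", bright=False, home="my-server")
your_laptop = Point("your-laptop", bright=False, home="your-server")

allowed = [
    may_connect(my_laptop, my_server),      # dark -> its own bright point
    may_connect(my_server, your_server),    # bright -> bright
    may_connect(your_laptop, your_server),  # dark -> its own bright point
]
refused = [
    may_connect(my_laptop, your_laptop),    # dark -> dark
    may_connect(my_laptop, your_server),    # dark -> someone else's bright point
    may_connect(my_server, my_laptop),      # bright -> dark
]
```

Every message from one dark point to another therefore takes the three permitted hops: dark to bright, bright to bright, and a final hop that the receiving dark point initiates itself.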
Why do I think this 4-Point Architecture is important? It's important because
it is a generally useful, proven architecture for distributed applications
that includes all kinds of devices, not just "bright" servers. Some of its
benefits are:
- It clearly assigns responsibilities: bright points must be stable and
available, for example, while dark points need not be.
- It enables reliable communication between unreliable dark points that may not
even be able to route to each other directly, regardless of how much work one
is willing to put in.
- It allows organizations to specialize: operating a bright point requires
a different set of expertise than operating a dark point.
- It allows local innovation: if somebody invents IMAP, they can install it
locally (between their bright and dark points), without impacting the
rest of the network.
- It provides clear guidance on where to attach functions and procedures
such as logging, approval, and quotas.
- It reduces the requirements on the dark nodes, which are often relatively
underpowered devices (e.g. cell phones can send e-mail without having
to support the entire SMTP protocol).
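
As a rough illustration of the point about logging and quotas, the sketch below puts both at the bright point, where every message from its dark points has to pass. The `BrightPoint` class, the `QuotaExceeded` exception and the per-sender quota policy are assumptions made up for this example, not part of any real mail or messaging server.

```python
from collections import Counter


class QuotaExceeded(Exception):
    """Raised when a sender has used up its (hypothetical) message quota."""


class BrightPoint:
    def __init__(self, quota_per_sender):
        self.quota_per_sender = quota_per_sender
        self.sent = Counter()  # messages accepted so far, per sender
        self.log = []          # audit trail kept at the bright point
        self.outbox = []       # messages waiting to be relayed onward

    def submit(self, sender, recipient, body):
        """Entry point used by dark points; policy is enforced here."""
        if self.sent[sender] >= self.quota_per_sender:
            self.log.append(f"rejected {sender} -> {recipient}: quota")
            raise QuotaExceeded(sender)
        self.sent[sender] += 1
        self.log.append(f"accepted {sender} -> {recipient}")
        self.outbox.append((sender, recipient, body))


bright = BrightPoint(quota_per_sender=1)
bright.submit("my-phone", "your-laptop", "first")
try:
    bright.submit("my-phone", "your-laptop", "second")
    second_rejected = False
except QuotaExceeded:
    second_rejected = True
```

The dark point needs no logging or quota logic of its own, which is also part of why it can be a small, underpowered device.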
Watch for the 4-Point Architecture to become more prominent in many
places going forward ... it simply makes too much sense not to.