
Commit 1ea1f3c

Completed guidelines with additional information
1 parent dcfca7b commit 1ea1f3c

File tree: 8 files changed (+262 −14 lines)

Diff for: assets/eda_problem_statement_1.png (249 KB)

Diff for: assets/eda_problem_statement_2.png (209 KB)

Diff for: assets/sr_backward_compatibility.png (16.7 KB)

Diff for: assets/sr_forward_compat.png (17.3 KB)

Diff for: assets/sr_full_compat.png (18 KB)

Diff for: asynchronous-api-guidelines/01_introduction/b_basic_concepts.md (+162 −5)
### Event-driven architectures
#### What is an event-driven architecture

Event-Driven Architecture (EDA) is a paradigm that promotes the production and consumption of, and reaction to, events.

This architectural pattern can be applied to the design and implementation of applications and systems that transmit events among loosely coupled software components and services.

An event-driven system typically consists of event emitters (or agents), event consumers (or sinks), and event channels.

- Producers (or publishers) are responsible for detecting, gathering and transferring events
  - They are not aware of consumers
  - They are not aware of how the events are consumed
- Consumers (or subscribers) react to events as soon as they are produced
  - The reaction can be self-contained, or it can be a composition of processes or components
- Event channels are conduits through which events are transmitted from emitters to consumers

**Note** The producer and consumer roles are not exclusive: the same client or application can be a producer and a consumer at the same time.

In most cases, EDAs are broker-centric, as seen in the diagram below. A minimal producer/consumer sketch follows the diagram.

![EDA overview](../../assets/eda_overview.png)

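To make the emitter/channel/consumer triad concrete, here is a minimal sketch using the plain Kafka Java clients. The broker address, topic name, and group id are illustrative assumptions, not values prescribed by these guidelines.

```java
// Minimal sketch of the emitter/channel/consumer triad (illustrative names).
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class EdaSketch {
    public static void main(String[] args) {
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // The producer only knows the event channel (the topic), never the consumers.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("cart-events", "cart-42", "ITEM_ADDED"));
        }

        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "checkout-service");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("auto.offset.reset", "earliest");

        // Any consumer group can subscribe to the channel without the producer changing.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("cart-events"));
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                System.out.printf("Reacting to %s for %s%n", record.value(), record.key());
            }
        }
    }
}
```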
#### Problem statement

Typically, the architectural landscape of a large company grows in complexity, and as a result it is easy to end up with a tangle of direct connections between a myriad of different components or modules.

![EDA problem statement](../../assets/eda_problem_statement_1.png)

By using streaming patterns, it is possible to get a much cleaner architecture:

![EDA with streaming patterns](../../assets/eda_problem_statement_2.png)

It is important to take into account that EDAs are not a silver bullet, and there are situations in which this kind of architecture might not fit well.

One example is systems that rely heavily on transactional operations: it might be possible to use an EDA, but most probably the complexity of the resulting architecture would be too high.

Also, it is important to note that it is possible to mix request-driven and event-driven protocols in the same system. For example:

- Online services that interact directly with a user fit better with synchronous communication, but they can also generate events into Kafka.
- On the other hand, offline services (billing, fulfillment, etc.) are typically built purely with events.

#### Kafka as the heart of EDAs

There are several technologies for implementing event-driven architectures, but this section focuses on the predominant technology on this subject: Apache Kafka.

**Apache Kafka** can be considered a streaming platform that relies on several concepts:

- Super high-performance, scalable, highly available cluster of brokers
- Availability
  - Replication of partitions across different brokers
- Scalability
  - Partitions
  - Ability to rebalance partitions across consumers automatically when adding/removing them
- Performance
  - Partitioned, replayable log (a collection of messages appended sequentially to a file)
  - Data copied directly from the disk buffer to the network buffer (zero copy) without even being imported into the JVM
  - Extreme throughput by using the concept of consumer groups
- Security
  - Secure encrypted connections using TLS client certificates
  - Multi-tenant management through quotas/ACLs
- Client APIs in different programming languages: Go, Scala, Python, REST, Java, ...
- Stream processing APIs (currently Kafka Streams and ksqlDB)
- Ecosystem of connectors to pull/push data from/to Kafka
- Clean-up processes for storage optimization (a sketch of these settings is shown below)
  - Retention periods
  - Compacted topics
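
To ground the clean-up bullet above, here is a hedged sketch (not an FDP requirement) that creates one topic with a time-based retention period and one compacted topic through the Kafka `AdminClient`; broker address, topic names and sizing values are illustrative.

```java
// Hedged sketch: configuring clean-up via retention and compaction with the
// AdminClient. Broker address, topic names and sizing are illustrative.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TopicSetupSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Retention period: events older than 7 days are deleted.
            NewTopic retained = new NewTopic("cart-events", 6, (short) 3)
                    .configs(Map.of(TopicConfig.RETENTION_MS_CONFIG,
                            String.valueOf(Duration.ofDays(7).toMillis())));

            // Compacted topic: only the latest event per key is kept.
            NewTopic compacted = new NewTopic("customer-state", 6, (short) 3)
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                            TopicConfig.CLEANUP_POLICY_COMPACT));

            admin.createTopics(List.of(retained, compacted)).all().get();
        }
    }
}
```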
### Basic terminology
#### Events

An event is both a fact and a notification: something that already happened in the real world.

- No expectation of any future action
- Includes information about a status change that just happened
- Travels in one direction and never expects a response (fire and forget)
- Very useful when...
  - Loose coupling is important
  - The same piece of information is used by several services
  - Data needs to be replicated across applications

A message in general is any interaction between an emitter and a receiver to exchange information. This implies that any event can be considered a message, but not the other way around.

#### Commands

A command is a special type of message which represents just an action, something that will change the state of a given system.

- Typically synchronous
- There is a clear expectation about a state change that needs to take place in the future
- Returning a response indicates completion
- Optionally, a result can be included in the response
- Very common in orchestration components

#### Query

A query is a special type of message which represents a request to look something up.

- Always free of side effects (leaves the system unchanged)
- Always requires a response (with the requested data)
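
To make the three message kinds tangible, here is a minimal sketch using plain Java records; every type and field name is illustrative.

```java
// Sketch: the three message kinds as plain Java records (all names illustrative).
import java.time.Instant;

// Event: a fact that already happened; fire and forget, no response expected.
record ItemAddedToCart(String cartId, String sku, Instant occurredAt) {}

// Command: an action that will change state; the sender expects completion,
// optionally with a result in the response.
record AddItemToCart(String cartId, String sku, int quantity) {}

// Query: a side-effect-free lookup; it always expects a response with the data.
record GetCartContents(String cartId) {}
```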
#### Coupling

The term coupling can be understood as the impact that a change in one component will have on other components. In the end, it is related to the amount of things that a given component shares with others: the more it shares, the tighter the coupling.

**Note** Tighter coupling is not necessarily a bad thing; it depends on the situation. It is necessary to assess the tradeoff between providing as much information as possible and avoiding having to change several components as a result of a change in another component.

The coupling of a single component is actually a function of these factors:

- Information exposed (interface surface area)
- Number of users
- Operational stability and performance
- Frequency of change

Messaging helps to build loosely coupled services because it moves pure data from a highly coupled location (the source) to a loosely coupled location (the subscriber).

Any operations that need to be performed on the data are done in each subscriber and never at the source. This way, messaging technologies (like Kafka) take most of the operational issues off the table.

All business systems in larger organizations need a base level of essential data coupling. In other words, functional couplings are optional, but core data couplings are essential.

#### Bounded context

A bounded context is a small group of services that share the same domain model, are usually deployed together and collaborate closely.

An analogy can be drawn with a hierarchical organization inside a company:

- Different departments are loosely coupled
- Inside a department there will be many more interactions across services and the coupling will be tighter

One of the big ideas of Domain-Driven Design (DDD) was to create boundaries around areas of a business domain and model them separately. Within the same bounded context the domain model is shared and everything is available to everyone there.

However, different bounded contexts don't share the same model, and if they need to interact they will do it through more restricted interfaces.

#### Stream processing

Stream processing can be understood as the capability of processing data directly as it is produced or received (hence, in real time or near real time).

[Review]
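
As a minimal illustration of this capability, the sketch below uses the Kafka Streams API mentioned earlier to transform events as they arrive; the application id and topic names are assumptions.

```java
// Hedged sketch: reacting to events as they arrive with Kafka Streams
// (application id and topic names are illustrative).
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class StreamSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "cart-event-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("cart-events");
        // Process each event as soon as it is produced: filter and re-emit.
        events.filter((key, value) -> value.contains("ITEM_ADDED"))
              .to("items-added");

        new KafkaStreams(builder.build(), props).start();
    }
}
```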
A message carries information from one application to another, while an event is a message that provides details of something that has already occurred. One important aspect to note is that, depending on the type of information a message contains, it can fall under an event, query, or command.

Overall, events are messages but not all messages are events.
### Using events in an EDA

There are several ways to use events in an EDA:

- Events as notifications
- Events to replicate data

#### Events as notifications

When a system uses events as notifications it becomes a pluggable system. The producers have no knowledge about the consumers and don't really care about them; instead, every consumer can decide whether it is interested in the information included in the event.

This way, the number of consumers can be increased (or reduced) without changing anything on the producer side.

This pluggability becomes increasingly important as systems get more complex.

#### Events to replicate data

When events are used to replicate data across services, they include all the information necessary for the target system to keep a local copy that can be queried with no external interactions.

This is usually called event-carried state transfer, which in the end is a form of data integration.

The benefits are similar to those implied by the use of a cache system (a sketch is shown after this list):

- Better isolation and autonomy, as the data stays under the service's control
- Faster data access, as the data is local (particularly important when combining data from different services in different geographies)
- Offline data availability
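
A hedged sketch of event-carried state transfer: a consumer keeps a local, queryable replica of the data it receives, so lookups need no external interaction. The topic name and the in-memory map standing in for a real store are illustrative.

```java
// Hedged sketch of event-carried state transfer: the consumer materializes a
// local replica of customer data (topic name and map-based store are illustrative).
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

public class CustomerReplica {
    private final Map<String, String> customersById = new ConcurrentHashMap<>();

    public void consume(Properties consumerProps) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("customer-state"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    // Each event carries the full state for the key, so the local
                    // copy can be queried later without calling the owning service.
                    customersById.put(record.key(), record.value());
                }
            }
        }
    }

    // Local lookup: isolated, fast, and available even if the source is offline.
    public String findCustomer(String customerId) {
        return customersById.get(customerId);
    }
}
```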

Diff for: asynchronous-api-guidelines/02_asynchronous_api_guidelines/main.md (+97 −8)
## Asynchronous API guidelines

This document is biased towards Kafka, which is the technology used in adidas for building Event-Driven Architectures.

### Contract

The definition of an asynchronous API **MUST** represent a contract between API owners and the stakeholders.

That contract **MUST** contain enough information to use the API (servers, URIs, credentials, contact information, etc.) and to identify which kind of information is being exchanged there.
### API First

The API types **MUST** adhere to the formats defined below:

| Type | Format | Example |
|------|--------|---------|
| Country Code | [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) | DE <-> Germany |
| Currency | [ISO 4217](https://en.wikipedia.org/wiki/ISO_4217) | EUR <-> Euro |

### Automatic schema registration

Applications **MUST NOT** enable automatic registration of schemas, because FDP's operational model for the Schema Registry relies on GitOps (every operation is done through Git PRs + automated pipelines). A configuration sketch is shown below.
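
As an illustration, a producer using Confluent's Avro serializer would switch the `auto.register.schemas` client flag off, so schemas are only ever registered through the GitOps pipeline; endpoint values below are illustrative.

```java
// Hedged sketch: producer configuration with client-side schema auto-registration
// disabled, assuming Confluent's Avro serializer. Endpoints are illustrative.
import java.util.Properties;

public class GitOpsFriendlyProducerConfig {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "https://schema-registry.example.com");
        // Schemas are managed through Git PRs + pipelines; the client must never
        // register them at runtime.
        props.put("auto.register.schemas", "false");
        return props;
    }
}
```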
### Schemas and data evolution
All asynchronous APIs **SHOULD** leverage Schema Registry to ensure consistency across consumers/producers with regard to message structure and to ensure compatibility across different schema versions.

The default compatibility mode in Schema Registry is FULL_TRANSITIVE. This is the most restrictive compatibility mode, but others are also available. More on this in the subsections below.

#### Compatibility modes

Once a given schema is defined, it is unavoidable that the schema evolves over time. Every time this happens, downstream consumers need to be able to handle data with both old and new schemas seamlessly.

Each new schema version is validated according to the configuration before being created as a new version. Namely, it is checked against the configured compatibility type (see below).

**Important** The mere fact of enabling Schema Registry is not enough to ensure that there are no compatibility issues in a given integration. The right compatibility mode also needs to be selected and enforced.

As a summary, the available compatibility types are listed below:

| Mode | Description |
|------|-------------|
| BACKWARD | consumers using a new schema version can read data produced with the previous version |
| BACKWARD_TRANSITIVE | consumers using a new schema version can read data produced with any previous version |
| FORWARD | consumers using the previous schema version can read data produced with the new version |
| FORWARD_TRANSITIVE | consumers using any previous schema version can read data produced with the new version |
| FULL | both backward and forward compatibility with the previous schema version |
| FULL_TRANSITIVE | both backward and forward compatibility with all schema versions |
| NONE | schema compatibility checks are disabled |

(info) To help visualize these concepts, consider the flow of compatibility from the perspective of the consumer.
#### Backward compatibility

There are two variants here:

- BACKWARD - Consumers using a new version (X) of a schema can read data produced with the previous version (X - 1)
- BACKWARD_TRANSITIVE - Consumers using a new version (X) of a schema can read data produced with any previous version (X - 1, X - 2, ...)

The operations that preserve backward compatibility are:

- Delete fields
  - Consumers with the newer version will just ignore the non-existing fields
- Add optional fields (with default values)
  - Consumers will set the default value for the fields missing in their schema version

![sr_backward](../../assets/sr_backward_compatibility.png)
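
As an illustration, the hedged Avro sketch below evolves a schema backward-compatibly: version 2 only adds an optional field with a default, so consumers on version 2 can still read version 1 data; schema and field names are illustrative.

```java
// Hedged sketch: a backward-compatible Avro evolution. Version 2 only adds an
// optional field with a default (schema and field names are illustrative).
import org.apache.avro.Schema;

public class SchemaEvolutionSketch {
    static final Schema V1 = new Schema.Parser().parse("""
        {"type": "record", "name": "CartEvent", "fields": [
          {"name": "cartId", "type": "string"}
        ]}""");

    // The new field has a default, so a V2 consumer reading V1 data
    // fills in "web" instead of failing.
    static final Schema V2 = new Schema.Parser().parse("""
        {"type": "record", "name": "CartEvent", "fields": [
          {"name": "cartId", "type": "string"},
          {"name": "channel", "type": "string", "default": "web"}
        ]}""");
}
```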
#### Forward compatibility

There are also two variants here:

- FORWARD - Consumers with the previous version of the schema (X - 1) can read data produced by producers with a new schema version (X)
- FORWARD_TRANSITIVE - Consumers with any previous version of the schema (X - 1, X - 2, ...) can read data produced by producers with a new schema version (X)

The operations that preserve forward compatibility are:

- Add a new field
  - Consumers will ignore the fields that are not defined in their schema version
- Delete optional fields (with default values)
  - Consumers will use the default value for the missing fields defined in their schema version

![sr_forward](../../assets/sr_forward_compat.png)
#### Full compatibility

This is a combination of both compatibility types (backward and forward). It also has two variants:

- FULL - Backward and forward compatible between schema versions X and X - 1.
- FULL_TRANSITIVE - Backward and forward compatible between schema version X and all previous ones (X - 1, X - 2, ...)

**Important** FULL_TRANSITIVE is the default compatibility mode in FDP; it is set at cluster level and all new schemas will inherit it.

This mode is preserved only when using the following operations:

- Add optional fields (with default values)
- Delete optional fields (with default values)
#### Upgrading process of clients based on compatibility

The process of upgrading producers and consumers differs depending on the compatibility mode enabled:

- NONE
  - As there are no compatibility checks, no order will grant a smooth transition
  - In most cases this leads to having to create a new topic for the evolution
- BACKWARD / BACKWARD_TRANSITIVE
  - Consumers **MUST** be upgraded first, before producing new data
  - No forward compatibility, meaning that there is no guarantee that consumers with older schemas will be able to read data produced with a new version
- FORWARD / FORWARD_TRANSITIVE
  - Producers **MUST** be upgraded first; then, after ensuring that no older data is present, upgrade the consumers
  - No backward compatibility, meaning that there is no guarantee that consumers with newer schemas will be able to read data produced with an older version
- FULL / FULL_TRANSITIVE
  - No restrictions on the order; anything will work
#### How to deal with breaking changes

If for any reason you need to use a less strict compatibility mode in a topic, or you can't avoid breaking changes in a given situation, the compatibility mode **SHOULD NOT** be modified on the same topic.

Instead, a new topic **SHOULD** be used to avoid unexpected behaviors or broken integrations. This allows clients to transition smoothly to the definitive topic, and once all clients are migrated the original one can be decommissioned.

Alternatively, instead of modifying existing fields, it **MAY** be considered (as a suboptimal approach) to add the changes in new fields and have both coexist. Take into account that this pollutes your topic and can cause some confusion.
### Key/Value message format

Kafka messages **MAY** include a key, which needs to be properly designed to achieve a good balance of data across partitions.

The message key and the payload (often called the value) can be serialized independently and can have different formats. For example, the value of the message can be sent in AVRO format, while the message key can be a primitive type (string).

Message keys **SHOULD** be kept as simple as possible and use a primitive type when possible.
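
A minimal sketch of a keyed send, assuming an Avro value: the key is a simple primitive (a cart id here), so all records for the same entity land on the same partition and per-key ordering is preserved; names are illustrative.

```java
// Sketch: primitive key + Avro value (topic, key and value are illustrative).
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeyedSendSketch {
    static void send(KafkaProducer<String, GenericRecord> producer, GenericRecord cartEvent) {
        // Key: a simple primitive (the cart id). All records with this key hash to
        // the same partition, so events for one cart stay in order.
        producer.send(new ProducerRecord<>("cart-events", "cart-42", cartEvent));
    }
}
```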
### Message headers

In addition to the key and value, a Kafka message **MAY** include ***headers***, which allow the information sent to be extended with metadata as needed (for example, the source of the data, routing or tracing information, or any relevant information that could be useful without having to parse the message payload).

Headers are just an ordered collection of key/value pairs, where the key is a String and the value is a serialized object, the same as the message value itself.
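
A small sketch of attaching headers, with illustrative header names: the metadata travels alongside the payload, so consumers can route or trace without parsing the value.

```java
// Sketch: metadata headers on a record (header names are illustrative).
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;

public class HeaderSketch {
    static ProducerRecord<String, String> withHeaders() {
        ProducerRecord<String, String> record =
                new ProducerRecord<>("cart-events", "cart-42", "ITEM_ADDED");
        // Headers are an ordered collection of (String, byte[]) pairs.
        record.headers().add("source", "cart-service".getBytes(StandardCharsets.UTF_8));
        record.headers().add("trace-id", "abc123".getBytes(StandardCharsets.UTF_8));
        return record;
    }
}
```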
### Naming conventions

As general naming conventions, asynchronous APIs **MUST** adhere to the following:

Diff for: asynchronous-api-guidelines/03_asyncapi_kafka_specs/b_guidelines.md (+3 −1)

AsyncAPI specs **MUST** include at least one main contact under the info.contact section.

The spec only allows one contact there, but it **MAY** also include additional contacts using extension fields. In that case, it **MUST** use the extension field *x-additional-responsibles*.
For example:
```yaml
...
```
