diff --git a/README.md b/README.md
index bfaa86a..6e1f5e7 100755
--- a/README.md
+++ b/README.md
@@ -61,27 +61,44 @@
 ---
-### source code and snippets
+### scripts and snippets
+#### services and pubs
+
 * **[docker](code/docker)**
 * **[kubernetes](code/kubernetes):**
    * **[spin up a node server](code/kubernetes/node-server-example)**
    * **[kustomize for deployment](code/kubernetes/kustomize)**
    * **[python cdk for deployment](code/kubernetes/python-cdk)**
+* **[kafka (long polling)](code/kafka)**
+
+
+
+#### cloud
+
 * **[aws](code/aws)**
 * **[gcp](code/gcp)**
+
+
+#### management
+
 * **[chef](code/chef)**
-* **[kafka](code/kafka)**
+
+
+
+
+#### learning
+
 * **[protocol demos](code/protocol_demos/)**
 ---
-### more resources
+### external resources
diff --git a/communication/README.md b/communication/README.md
index 2c5be34..b78495c 100644
--- a/communication/README.md
+++ b/communication/README.md
@@ -8,24 +8,24 @@
 #### used in
- - the web, HTTP, DNS, SSH
- - RPC (remote procedure call)
- - SQL and database protocols
- - APIs (REST/SOAP/GraphQL)
+- the web, HTTP, DNS, SSH
+- RPC (remote procedure call)
+- SQL and database protocols
+- APIs (REST/SOAP/GraphQL)
 #### the basic idea
- 1. clients sends a request
- - the request structure is defined by both client and server and has a boundary.
- 2. server parses the request
+1. client sends a request
+   - the request structure is defined by both client and server and has a boundary
+2. server parses the request
    - the parsing cost is not cheap (e.g. `json` vs. `xml` vs. protocol buffers)
    - for example, for a large image, chunks can be sent, with a request per chunk
- 3. Server processes the request
- 4. Server sends a response
- 5. Client parse the Response and consume
+3. server processes the request
+4. server sends a response
+5. client parses the response and consumes it
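+
+a minimal sketch of the request/response steps above, in Python (standard library only; the host and the handling are illustrative, not from these notes):
+
+```python
+import http.client
+
+# 1. client sends a request; the HTTP message itself defines the boundary
+conn = http.client.HTTPSConnection("example.com")
+conn.request("GET", "/")
+
+# 4. server sends a response; 5. client parses it and consumes the body
+resp = conn.getresponse()
+body = resp.read()   # parsing cost depends on the format (json vs. xml vs. protocol buffers)
+print(resp.status, len(body))
+conn.close()
+```
+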
@@ -80,49 +80,50 @@ curl -v --trace marinasouza.xyz
 #### Synchronous I/O: the basic idea
- 1. Caller sends a request and blocks
- 2. Caller cannot execute any code meanwhile
- 3. Receiver responds, Caller unblocks
- 4. Caller and Receiver are in sync
+1. Caller sends a request and blocks
+2. Caller cannot execute any code meanwhile
+3. Receiver responds, Caller unblocks
+4. Caller and Receiver are in sync
 ##### example (note the waste!)
- 1. program asks OS to read from disk
- 2. program main threads is taken off the CPU
- 3. read is complete and program resume execution (costly)
+1. program asks the OS to read from disk
+2. the program's main thread is taken off the CPU
+3. the read completes and the program resumes execution (costly)
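+
+a blocking-read sketch of the waste above (Python standard library; the file path is just an example):
+
+```python
+# synchronous I/O: the calling thread can do nothing while the kernel reads the disk
+with open("/etc/hostname", "rb") as f:
+    data = f.read()   # the thread blocks here until the read completes
+
+# only now does execution resume (and getting rescheduled onto the CPU is costly)
+print(f"read {len(data)} bytes")
+```
+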
 #### Asynchronous I/O: the basic idea
- 1. caller sends a request
- 2. caller can work until it gets a response
- 3. caller either:
+1. caller sends a request
+2. caller can work until it gets a response
+3. caller either:
    - checks whether the response is ready (epoll)
    - receiver calls back when it's done (io_uring)
    - spins up a new thread that blocks
- 4. caller and receiver not in sync
+4. caller and receiver are not in sync
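+
+a readiness-check sketch of the "checks whether the response is ready" option (Python's `selectors` module, which uses epoll on Linux; the host is illustrative):
+
+```python
+import selectors
+import socket
+
+sel = selectors.DefaultSelector()   # epoll under the hood on Linux
+
+sock = socket.create_connection(("example.com", 80))
+sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
+sock.setblocking(False)
+sel.register(sock, selectors.EVENT_READ)
+
+# the caller is free to do other work here instead of blocking on recv()
+
+for key, _ in sel.select(timeout=5):     # ask the kernel which sockets are ready
+    print(key.fileobj.recv(4096)[:80])   # read only once data is actually there
+
+sel.unregister(sock)
+sock.close()
+```
+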
 #### Sync vs. Async in a Request Response
- - synchronicity is a client property
- - most modern client libraries are async
+- synchronicity is a client property
+- most modern client libraries are async
 #### Async workload is everywhere
- - async programming (promises, futures)
- - async backend processing
- - async commits in postgres
- - async IO in Linux (epoll, io_uring)
- - async replication
- - async OS fsync (filesystem cache)
+
+- async programming (promises, futures)
+- async backend processing
+- async commits in postgres
+- async IO in Linux (epoll, io_uring)
+- async replication
+- async OS fsync (filesystem cache)
@@ -134,24 +135,20 @@ curl -v --trace marinasouza.xyz
-#### pros and coins
+#### pros and cons
- - real time
- - the client must be online (connected to the server)
- - the client must be able to handle the load
- - polling is preferred for light clients
+- real-time
+- the client must be online (connected to the server)
+- the client must be able to handle the load
+- polling is preferred for light clients
+- used by RabbitMQ (clients consume the queues, and the messages are pushed to the clients)
-#### basic idea
+#### the basic idea
- 1. client connects to a server
- 2. server sends data to the client
- 3. client doesn't have to request anything
- 4. protocol must be bidirectional
-
-
-#### used in
-
- - RabbitMQ (clients consume the queues, and the messages are pushed to the clients)
+1. client connects to a server
+2. server sends data to the client
+3. client doesn't have to request anything
+4. protocol must be bidirectional
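+
+a minimal push sketch over a raw bidirectional connection (Python `asyncio` streams; the port and message cadence are made up):
+
+```python
+import asyncio
+
+async def handle_client(reader, writer):
+    # the client connected once; from now on the server pushes without being asked
+    for i in range(3):
+        writer.write(f"event {i}\n".encode())
+        await writer.drain()
+        await asyncio.sleep(1)
+    writer.close()
+    await writer.wait_closed()
+
+async def main():
+    server = await asyncio.start_server(handle_client, "127.0.0.1", 8765)
+    async with server:
+        await server.serve_forever()
+
+asyncio.run(main())   # any client that connects will receive pushed events
+```
+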
@@ -162,17 +159,18 @@ curl -v --trace marinasouza.xyz
-* used when a request takes long time to process (e.g., upload a video) and very simple to build.
-* however, it can be too chatting, use too much network bandwidth and backend resources.
+* used when a request takes a long time to process (e.g., uploading a video) and is very simple to build
+* however, it can be too chatty, using too much network bandwidth and backend resources
-
-#### basic idea
+
+
+#### the basic idea
- 1. client sends a request
- 2. server responds immediately with a handle
- 3. server continues to process the request
- 4. client uses that handle to check for status
- 5. multiple short request response as polls
+1. client sends a request
+2. server responds immediately with a handle
+3. server continues to process the request
+4. client uses that handle to check for status
+5. multiple short request/response cycles serve as polls
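+
+a short-polling sketch of the handle/status flow above (Python with `requests`; the endpoints and response fields are hypothetical):
+
+```python
+import time
+import requests
+
+# 1-2. submit the job and get a handle back immediately
+job = requests.post("https://api.example.com/jobs", json={"video": "cat.mp4"}).json()
+handle = job["id"]
+
+# 4-5. many short request/response cycles until the job is done (this is the chatty part)
+while True:
+    status = requests.get(f"https://api.example.com/jobs/{handle}").json()
+    if status["state"] == "done":
+        break
+    time.sleep(2)   # every poll costs bandwidth and backend work
+
+print(status["result"])
+```
+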
@@ -189,15 +187,15 @@ curl -v --trace marinasouza.xyz
-#### basic idea
+#### the basic idea
-
- 1. clients sends a request
- 2. server responds immediately with a handle
- 3. server continues to process the request
- 4. client uses that handle to check for status
- 5. server does not reply until has the response (and there are some timeouts)
+
+1. client sends a request
+2. server responds immediately with a handle
+3. server continues to process the request
+4. client uses that handle to check for status
+5. server does not reply to the status check until it has the response (subject to some timeout)
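+
+a long-polling sketch of the same flow (Python with `requests`; the endpoints and the server-side hold are hypothetical):
+
+```python
+import requests
+
+# 1-2. submit the job and get a handle back immediately
+job = requests.post("https://api.example.com/jobs", json={"video": "cat.mp4"}).json()
+handle = job["id"]
+
+# 4-5. the server holds each status request open until it has news (or a timeout expires),
+# so far fewer requests are made than with short polling
+while True:
+    status = requests.get(f"https://api.example.com/jobs/{handle}/wait", timeout=35).json()
+    if status["state"] == "done":
+        break
+
+print(status["result"])
+```
+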
@@ -209,19 +207,19 @@ curl -v --trace marinasouza.xyz
-* one request with a long response, but the client must be online and be able to handle the response.
+* one request with a long response, but the client must be online and be able to handle the response
-#### basic idea
+#### the basic idea
- 1. a response has start and end
- 2. client sends a request
- 3. server sends logical events as part of response
- 4. server never writes the end of the response
- 5. it's still a request but an unending response
- 6. client parses the streams data
- 7. works with HTTP
+1. a response has a start and an end
+2. client sends a request
+3. server sends logical events as part of the response
+4. server never writes the end of the response
+5. it's still a request, but with an unending response
+6. client parses the streamed data
+7. works with HTTP
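+
+a minimal client sketch for the flow above (Python with `requests`; the URL is hypothetical, and the `data:` framing is the standard `text/event-stream` format):
+
+```python
+import requests
+
+# one request; the response never ends, the server just keeps appending events
+with requests.get("https://api.example.com/events", stream=True) as resp:
+    for line in resp.iter_lines(decode_unicode=True):
+        if line and line.startswith("data:"):   # each logical event arrives as "data: ..."
+            print("event:", line[5:].strip())
+```
+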
@@ -229,6 +227,17 @@ curl -v --trace marinasouza.xyz
 ### Publish Subscribe (Pub/Sub)
+
+
+* one publisher has many readers (and there can be many publishers)
+* relevant when there are many servers (e.g., upload, compress, format, notification)
+* great for microservices as it scales with multiple receivers
+* loose coupling (clients are not connected to each other, and the system keeps working while some clients are not running)
+* however, you cannot know whether the consumer/subscriber got the message, or got it twice, etc.
+* also, it might result in network saturation and extra complexity
+* used by RabbitMQ and Kafka
+
+
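+
+an in-process fan-out sketch of the publish/subscribe idea (plain Python; a real broker such as RabbitMQ or Kafka adds persistence, delivery guarantees, and the network in between):
+
+```python
+from collections import defaultdict
+
+# topic -> subscriber callbacks; publishers and subscribers never talk to each other
+subscribers = defaultdict(list)
+
+def subscribe(topic, callback):
+    subscribers[topic].append(callback)
+
+def publish(topic, message):
+    for callback in subscribers[topic]:   # every subscriber gets its own copy
+        callback(message)
+
+subscribe("video.uploaded", lambda m: print("compress:", m))
+subscribe("video.uploaded", lambda m: print("notify:", m))
+
+publish("video.uploaded", {"id": 42})   # the publisher has no idea who consumes this
+```
+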
@@ -236,6 +245,12 @@ curl -v --trace marinasouza.xyz
 ### Multiplexing vs. Demultiplexing
+
+
+
+* used by HTTP/2, QUIC, connection pooling, MPTCP
+* connection pooling is a technique where you spin up several backend connections and keep them "hot"
+
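+
+a minimal connection-pool sketch (Python standard library; the backend host, port, and pool size are made up):
+
+```python
+import queue
+import socket
+
+POOL_SIZE = 4
+pool = queue.Queue()
+
+# open the connections up front and keep them "hot"
+for _ in range(POOL_SIZE):
+    pool.put(socket.create_connection(("backend.internal", 5432)))
+
+def with_connection(work):
+    conn = pool.get()   # borrow a warm connection instead of dialing a new one
+    try:
+        return work(conn)
+    finally:
+        pool.put(conn)   # return it so other requests can reuse the same connections
+```
+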
@@ -244,11 +259,69 @@ curl -v --trace marinasouza.xyz
 ### Stateful vs. Stateless
+
+
+* a very contentious topic: is state stored in the backend? how much do you rely on the state of an application, system, or protocol?
+* **stateful backend**: stores state about clients in its memory and depends on that information being there
+* **stateless backend**: the client is responsible for "transferring the state" with every request (the backend may store it, but it can safely lose it)
+
+
+
+#### Stateless backends
+
+* stateless backends can still store data somewhere else
+* the backend remains stateless but the system is stateful (test: can you restart the backend during idle time while the client workflow continues to work?)
+
+
+
+#### Stateful backend
+
+* the server generates a session, stores it in its own memory, and returns it to the user
+* on later requests, the server checks whether that session is in memory to authenticate the client
+* if the backend is restarted, the sessions are gone (they were never persisted to the database)
+
+
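+
+a toy sketch of the stateful session flow above (plain Python; names are illustrative, and a real backend would do this inside its web framework):
+
+```python
+import secrets
+
+sessions = {}   # lives only in this process's memory; a restart wipes it
+
+def login(username):
+    token = secrets.token_hex(16)   # the handle returned to the user
+    sessions[token] = username      # state the backend now depends on
+    return token
+
+def authenticate(token):
+    # works only if THIS process still remembers the session
+    return sessions.get(token)
+
+t = login("marina")
+print(authenticate(t))   # "marina" ... until the backend restarts
+```
+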
+
+#### Stateless vs. Stateful protocols
+
+* protocols can be designed to store state
+* TCP is stateful: sequence numbers, connection file descriptors
+* UDP is stateless: DNS sends a queryID over UDP to identify queries
+* QUIC is stateful, but it sends a connectionID to identify each connection, transferring that state with the protocol
+* you can build a stateless protocol on top of a stateful one and vice versa (e.g., HTTP on top of TCP, with cookies)
+
+
+
+#### Complete stateless systems
+
+* stateless systems are very rare
+* state is carried with every request
+* the backend service relies entirely on its input
+* with **JWT (JSON Web Token)**, everything is in the token and you cannot mark an issued token as invalid
+
+
+
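+
+a stripped-down sketch of the "everything is in the token" idea (plain Python with an HMAC signature; real JWTs follow RFC 7519 and are normally handled by a library such as PyJWT):
+
+```python
+import base64
+import hashlib
+import hmac
+import json
+
+SECRET = b"server-side-signing-key"   # the only thing the backend has to keep
+
+def issue(claims):
+    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
+    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
+    return payload + "." + sig
+
+def verify(token):
+    payload, sig = token.rsplit(".", 1)
+    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
+    if not hmac.compare_digest(sig, expected):
+        raise ValueError("bad signature")
+    # no session lookup: the state travels inside the token,
+    # which is also why an individual token cannot be revoked server-side
+    return json.loads(base64.urlsafe_b64decode(payload))
+
+token = issue({"user": "marina", "role": "admin"})
+print(verify(token))
+```
+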
 ---
 ### Sidecar Pattern
+
+
+* every protocol requires a library, but changing the library is hard because the app is entrenched in it and changes can break backward compatibility
+* the sidecar pattern delegates communication to a proxy that has a rich library, so the client only needs a thin library
+* in this case, every client has a sidecar proxy
+* pros: language agnostic, extra security, service discovery, caching
+* cons: complexity, latency
+
+
+
+#### Examples
+
+* service mesh proxies (Linkerd, Istio, Envoy)
+* sidecar proxy container (must be a layer 7 proxy)
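+
+a toy layer-7 sidecar sketch (Python standard library; the upstream host, ports, and injected header are made up):
+
+```python
+from http.server import BaseHTTPRequestHandler, HTTPServer
+import http.client
+
+UPSTREAM = "backend.internal"   # the app itself only ever talks to localhost:15001
+
+class Sidecar(BaseHTTPRequestHandler):
+    def do_GET(self):
+        # the "rich library" concerns (mTLS, retries, discovery, caching) would live here
+        conn = http.client.HTTPConnection(UPSTREAM, 8080)
+        conn.request("GET", self.path, headers={"x-request-id": "abc123"})
+        resp = conn.getresponse()
+        body = resp.read()
+        self.send_response(resp.status)
+        self.end_headers()
+        self.wfile.write(body)
+
+HTTPServer(("127.0.0.1", 15001), Sidecar).serve_forever()
+```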
\ No newline at end of file