mirror of
https://git.anonymousland.org/anonymousland/synapse.git
synced 2025-01-13 19:29:24 -05:00
Opentracing Documentation (#5703)
* Opentracing survival guide * Update decorator names in doc * Doc cleanup These are all alterations as a result of comments in #5703, it includes mostly typos and clarifications. The most interesting changes are: - Split developer and user docs into two sections - Add a high level description of OpenTracing * newsfile * Move contributer specific info to docstring. * Sample config. * Trailing whitespace. * Update 5703.misc * Apply suggestions from code review Mostly just rewording parts of the docs for clarity. Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
This commit is contained in:
parent
a3e40bd5b4
commit
826e6ec3bd
1
changelog.d/5703.misc
Normal file
1
changelog.d/5703.misc
Normal file
@ -0,0 +1 @@
|
|||||||
|
Documentation for opentracing.
|
100
docs/opentracing.rst
Normal file
100
docs/opentracing.rst
Normal file
@ -0,0 +1,100 @@
|
|||||||
|
===========
|
||||||
|
OpenTracing
|
||||||
|
===========
|
||||||
|
|
||||||
|
Background
|
||||||
|
----------
|
||||||
|
|
||||||
|
OpenTracing is a semi-standard being adopted by a number of distributed tracing
|
||||||
|
platforms. It is a common api for facilitating vendor-agnostic tracing
|
||||||
|
instrumentation. That is, we can use the OpenTracing api and select one of a
|
||||||
|
number of tracer implementations to do the heavy lifting in the background.
|
||||||
|
Our current selected implementation is Jaeger.
|
||||||
|
|
||||||
|
OpenTracing is a tool which gives an insight into the causal relationship of
|
||||||
|
work done in and between servers. The servers each track events and report them
|
||||||
|
to a centralised server - in Synapse's case: Jaeger. The basic unit used to
|
||||||
|
represent events is the span. The span roughly represents a single piece of work
|
||||||
|
that was done and the time at which it occurred. A span can have child spans,
|
||||||
|
meaning that the work of the child had to be completed for the parent span to
|
||||||
|
complete, or it can have follow-on spans which represent work that is undertaken
|
||||||
|
as a result of the parent but is not depended on by the parent to in order to
|
||||||
|
finish.
|
||||||
|
|
||||||
|
Since this is undertaken in a distributed environment a request to another
|
||||||
|
server, such as an RPC or a simple GET, can be considered a span (a unit or
|
||||||
|
work) for the local server. This causal link is what OpenTracing aims to
|
||||||
|
capture and visualise. In order to do this metadata about the local server's
|
||||||
|
span, i.e the 'span context', needs to be included with the request to the
|
||||||
|
remote.
|
||||||
|
|
||||||
|
It is up to the remote server to decide what it does with the spans
|
||||||
|
it creates. This is called the sampling policy and it can be configured
|
||||||
|
through Jaeger's settings.
|
||||||
|
|
||||||
|
For OpenTracing concepts see
|
||||||
|
https://opentracing.io/docs/overview/what-is-tracing/.
|
||||||
|
|
||||||
|
For more information about Jaeger's implementation see
|
||||||
|
https://www.jaegertracing.io/docs/
|
||||||
|
|
||||||
|
=====================
|
||||||
|
Seting up OpenTracing
|
||||||
|
=====================
|
||||||
|
|
||||||
|
To receive OpenTracing spans, start up a Jaeger server. This can be done
|
||||||
|
using docker like so:
|
||||||
|
|
||||||
|
.. code-block:: bash
|
||||||
|
|
||||||
|
docker run -d --name jaeger
|
||||||
|
-p 6831:6831/udp \
|
||||||
|
-p 6832:6832/udp \
|
||||||
|
-p 5778:5778 \
|
||||||
|
-p 16686:16686 \
|
||||||
|
-p 14268:14268 \
|
||||||
|
jaegertracing/all-in-one:1.13
|
||||||
|
|
||||||
|
Latest documentation is probably at
|
||||||
|
https://www.jaegertracing.io/docs/1.13/getting-started/
|
||||||
|
|
||||||
|
|
||||||
|
Enable OpenTracing in Synapse
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
OpenTracing is not enabled by default. It must be enabled in the homeserver
|
||||||
|
config by uncommenting the config options under ``opentracing`` as shown in
|
||||||
|
the `sample config <./sample_config.yaml>`_. For example:
|
||||||
|
|
||||||
|
.. code-block:: yaml
|
||||||
|
|
||||||
|
opentracing:
|
||||||
|
tracer_enabled: true
|
||||||
|
homeserver_whitelist:
|
||||||
|
- "mytrustedhomeserver.org"
|
||||||
|
- "*.myotherhomeservers.com"
|
||||||
|
|
||||||
|
Homeserver whitelisting
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
The homeserver whitelist is configured using regular expressions. A list of regular
|
||||||
|
expressions can be given and their union will be compared when propagating any
|
||||||
|
spans contexts to another homeserver.
|
||||||
|
|
||||||
|
Though it's mostly safe to send and receive span contexts to and from
|
||||||
|
untrusted users since span contexts are usually opaque ids it can lead to
|
||||||
|
two problems, namely:
|
||||||
|
|
||||||
|
- If the span context is marked as sampled by the sending homeserver the receiver will
|
||||||
|
sample it. Therefore two homeservers with wildly different sampling policies
|
||||||
|
could incur higher sampling counts than intended.
|
||||||
|
- Sending servers can attach arbitrary data to spans, known as 'baggage'. For safety this has been disabled in Synapse
|
||||||
|
but that doesn't prevent another server sending you baggage which will be logged
|
||||||
|
to OpenTracing's logs.
|
||||||
|
|
||||||
|
==================
|
||||||
|
Configuring Jaeger
|
||||||
|
==================
|
||||||
|
|
||||||
|
Sampling strategies can be set as in this document:
|
||||||
|
https://www.jaegertracing.io/docs/1.13/sampling/
|
@ -1422,18 +1422,8 @@ opentracing:
|
|||||||
#enabled: true
|
#enabled: true
|
||||||
|
|
||||||
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
||||||
#
|
# See docs/opentracing.rst
|
||||||
# Though it's mostly safe to send and receive span contexts to and from
|
# This is a list of regexes which are matched against the server_name of the
|
||||||
# untrusted users since span contexts are usually opaque ids it can lead to
|
|
||||||
# two problems, namely:
|
|
||||||
# - If the span context is marked as sampled by the sending homeserver the receiver will
|
|
||||||
# sample it. Therefore two homeservers with wildly disparaging sampling policies
|
|
||||||
# could incur higher sampling counts than intended.
|
|
||||||
# - Span baggage can be arbitrary data. For safety this has been disabled in synapse
|
|
||||||
# but that doesn't prevent another server sending you baggage which will be logged
|
|
||||||
# to opentracing logs.
|
|
||||||
#
|
|
||||||
# This a list of regexes which are matched against the server_name of the
|
|
||||||
# homeserver.
|
# homeserver.
|
||||||
#
|
#
|
||||||
# By defult, it is empty, so no servers are matched.
|
# By defult, it is empty, so no servers are matched.
|
||||||
|
@ -48,18 +48,8 @@ class TracerConfig(Config):
|
|||||||
#enabled: true
|
#enabled: true
|
||||||
|
|
||||||
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
||||||
#
|
# See docs/opentracing.rst
|
||||||
# Though it's mostly safe to send and receive span contexts to and from
|
# This is a list of regexes which are matched against the server_name of the
|
||||||
# untrusted users since span contexts are usually opaque ids it can lead to
|
|
||||||
# two problems, namely:
|
|
||||||
# - If the span context is marked as sampled by the sending homeserver the receiver will
|
|
||||||
# sample it. Therefore two homeservers with wildly disparaging sampling policies
|
|
||||||
# could incur higher sampling counts than intended.
|
|
||||||
# - Span baggage can be arbitrary data. For safety this has been disabled in synapse
|
|
||||||
# but that doesn't prevent another server sending you baggage which will be logged
|
|
||||||
# to opentracing logs.
|
|
||||||
#
|
|
||||||
# This a list of regexes which are matched against the server_name of the
|
|
||||||
# homeserver.
|
# homeserver.
|
||||||
#
|
#
|
||||||
# By defult, it is empty, so no servers are matched.
|
# By defult, it is empty, so no servers are matched.
|
||||||
|
@ -24,6 +24,131 @@
|
|||||||
# this move the methods have work very similarly to opentracing's and it should only
|
# this move the methods have work very similarly to opentracing's and it should only
|
||||||
# be a matter of few regexes to move over to opentracing's access patterns proper.
|
# be a matter of few regexes to move over to opentracing's access patterns proper.
|
||||||
|
|
||||||
|
"""
|
||||||
|
============================
|
||||||
|
Using OpenTracing in Synapse
|
||||||
|
============================
|
||||||
|
|
||||||
|
Python-specific tracing concepts are at https://opentracing.io/guides/python/.
|
||||||
|
Note that Synapse wraps OpenTracing in a small module (this one) in order to make the
|
||||||
|
OpenTracing dependency optional. That means that the access patterns are
|
||||||
|
different to those demonstrated in the OpenTracing guides. However, it is
|
||||||
|
still useful to know, especially if OpenTracing is included as a full dependency
|
||||||
|
in the future or if you are modifying this module.
|
||||||
|
|
||||||
|
|
||||||
|
OpenTracing is encapsulated so that
|
||||||
|
no span objects from OpenTracing are exposed in Synapse's code. This allows
|
||||||
|
OpenTracing to be easily disabled in Synapse and thereby have OpenTracing as
|
||||||
|
an optional dependency. This does however limit the number of modifiable spans
|
||||||
|
at any point in the code to one. From here out references to `opentracing`
|
||||||
|
in the code snippets refer to the Synapses module.
|
||||||
|
|
||||||
|
Tracing
|
||||||
|
-------
|
||||||
|
|
||||||
|
In Synapse it is not possible to start a non-active span. Spans can be started
|
||||||
|
using the ``start_active_span`` method. This returns a scope (see
|
||||||
|
OpenTracing docs) which is a context manager that needs to be entered and
|
||||||
|
exited. This is usually done by using ``with``.
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
|
from synapse.logging.opentracing import start_active_span
|
||||||
|
|
||||||
|
with start_active_span("operation name"):
|
||||||
|
# Do something we want to tracer
|
||||||
|
|
||||||
|
Forgetting to enter or exit a scope will result in some mysterious and grievous log
|
||||||
|
context errors.
|
||||||
|
|
||||||
|
At anytime where there is an active span ``opentracing.set_tag`` can be used to
|
||||||
|
set a tag on the current active span.
|
||||||
|
|
||||||
|
Tracing functions
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
Functions can be easily traced using decorators. There is a decorator for
|
||||||
|
'normal' function and for functions which are actually deferreds. The name of
|
||||||
|
the function becomes the operation name for the span.
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
|
from synapse.logging.opentracing import trace, trace_deferred
|
||||||
|
|
||||||
|
# Start a span using 'normal_function' as the operation name
|
||||||
|
@trace
|
||||||
|
def normal_function(*args, **kwargs):
|
||||||
|
# Does all kinds of cool and expected things
|
||||||
|
return something_usual_and_useful
|
||||||
|
|
||||||
|
# Start a span using 'deferred_function' as the operation name
|
||||||
|
@trace_deferred
|
||||||
|
@defer.inlineCallbacks
|
||||||
|
def deferred_function(*args, **kwargs):
|
||||||
|
# We start
|
||||||
|
yield we_wait
|
||||||
|
# we finish
|
||||||
|
defer.returnValue(something_usual_and_useful)
|
||||||
|
|
||||||
|
Operation names can be explicitly set for functions by using
|
||||||
|
``trace_using_operation_name`` and
|
||||||
|
``trace_deferred_using_operation_name``
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
|
from synapse.logging.opentracing import (
|
||||||
|
trace_using_operation_name,
|
||||||
|
trace_deferred_using_operation_name
|
||||||
|
)
|
||||||
|
|
||||||
|
@trace_using_operation_name("A *much* better operation name")
|
||||||
|
def normal_function(*args, **kwargs):
|
||||||
|
# Does all kinds of cool and expected things
|
||||||
|
return something_usual_and_useful
|
||||||
|
|
||||||
|
@trace_deferred_using_operation_name("Another exciting operation name!")
|
||||||
|
@defer.inlineCallbacks
|
||||||
|
def deferred_function(*args, **kwargs):
|
||||||
|
# We start
|
||||||
|
yield we_wait
|
||||||
|
# we finish
|
||||||
|
defer.returnValue(something_usual_and_useful)
|
||||||
|
|
||||||
|
Contexts and carriers
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
There are a selection of wrappers for injecting and extracting contexts from
|
||||||
|
carriers provided. Unfortunately OpenTracing's three context injection
|
||||||
|
techniques are not adequate for our inject of OpenTracing span-contexts into
|
||||||
|
Twisted's http headers, EDU contents and our database tables. Also note that
|
||||||
|
the binary encoding format mandated by OpenTracing is not actually implemented
|
||||||
|
by jaeger_client v4.0.0 - it will silently noop.
|
||||||
|
Please refer to the end of ``logging/opentracing.py`` for the available
|
||||||
|
injection and extraction methods.
|
||||||
|
|
||||||
|
Homeserver whitelisting
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
Most of the whitelist checks are encapsulated in the modules's injection
|
||||||
|
and extraction method but be aware that using custom carriers or crossing
|
||||||
|
unchartered waters will require the enforcement of the whitelist.
|
||||||
|
``logging/opentracing.py`` has a ``whitelisted_homeserver`` method which takes
|
||||||
|
in a destination and compares it to the whitelist.
|
||||||
|
|
||||||
|
=======
|
||||||
|
Gotchas
|
||||||
|
=======
|
||||||
|
|
||||||
|
- Checking whitelists on span propagation
|
||||||
|
- Inserting pii
|
||||||
|
- Forgetting to enter or exit a scope
|
||||||
|
- Span source: make sure that the span you expect to be active across a
|
||||||
|
function call really will be that one. Does the current function have more
|
||||||
|
than one caller? Will all of those calling functions have be in a context
|
||||||
|
with an active span?
|
||||||
|
"""
|
||||||
|
|
||||||
import contextlib
|
import contextlib
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
|
Loading…
Reference in New Issue
Block a user