{"id":729,"date":"2022-06-15T17:55:08","date_gmt":"2022-06-15T07:55:08","guid":{"rendered":"https:\/\/sysmit.com\/cf22\/?p=729"},"modified":"2023-12-13T15:28:02","modified_gmt":"2023-12-13T05:28:02","slug":"jaeger-tracing-software-observability","status":"publish","type":"post","link":"https:\/\/sysmit.com\/cf22\/jaeger-tracing-software-observability\/","title":{"rendered":"How Jaeger tracing fits into software observability"},"content":{"rendered":"\n<p>In this article, I will share how tracing and more specifically Jaeger tracing can fit into your wider software observability strategy. <\/p>\n\n\n\n<p>Before we get into tracing, let&#8217;s define observability.<\/p>\n\n\n<h2 class=\"gb-headline gb-headline-0aaf3534 gb-headline-text\" id=\"what-is-observability\">What is observability?<\/h2>\n\n\n<p>Observability is a comprehensive means of gaining data on how software services perform in production. <\/p>\n\n\n\n<p>This data gives you <strong>a picture of the health and performance of individual services<\/strong>, as well as the cloud infrastructure that supports them. <\/p>\n\n\n\n<p>It can be broken down into 3 actions: logging, tracing, and monitoring. Our focus in this article will be on tracing. <\/p>\n\n\n<h2 class=\"gb-headline gb-headline-43525e1f gb-headline-text\" id=\"what-is-tracing\">What is tracing?<\/h2>\n\n\n<p>Tracing is an action that <strong>tracks a request from initiation to completion<\/strong> within a microservices architecture. <\/p>\n\n\n\n<p>It usually starts when a user or service starts a request which moves along a chain of interconnected services needed to fulfill the request. <\/p>\n\n\n\n<p>With tracing enabled, software engineers and SREs can pinpoint any issues within the chain of requests among the various involved services. 
<\/p>\n\n\n<h2 class=\"gb-headline gb-headline-29e4c8aa gb-headline-text\" id=\"where-jaeger-fits-into-the-tracing-paradigm\">Where Jaeger fits into the tracing paradigm<\/h2>\n\n<h3 class=\"gb-headline gb-headline-e4a6f747 gb-headline-text\" id=\"what-is-jaeger-tracing\">What is Jaeger tracing?<\/h3>\n\n\n<p>Jaeger is an open-source tracing tool that allows engineers to <strong>track request performance and issues across tens, hundreds, or even thousands of services<\/strong> and their dependencies. It collects tracing data and displays it in its own web UI; the same data can also feed Grafana dashboards.<\/p>\n\n\n\n<p>The key benefit is that it highlights errors and the risk of downtime or slow load times. This makes it an essential component of a strong observability practice. <\/p>\n\n\n<h3 class=\"gb-headline gb-headline-cf174d1c gb-headline-text\" id=\"jaegers-origin-story\">Jaeger&#8217;s origin story<\/h3>\n\n\n<p>Jaeger was created in 2015 by an engineer at Uber, Yuri Shkuro, who wanted to help engineers work out&nbsp;<em>where<\/em>&nbsp;issues were popping up. This emerged as a critical need at Uber over time.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"480\" height=\"270\" src=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/uber-microservices-map.png\" alt=\"Glimpse of microservices that drive the Uber app. A large number of these services get triggered every time you request an Uber ride.\" class=\"wp-image-733\" srcset=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/uber-microservices-map.png 480w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/uber-microservices-map-300x169.png 300w\" sizes=\"(max-width: 480px) 100vw, 480px\" \/><figcaption class=\"wp-element-caption\">Above: a glimpse of services that support the Uber app. 
Many of these services get triggered every time you request an Uber ride.&nbsp;<em>(Source: YouTube,&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/youtu.be\/UNqilb9_zwY?t=185\" target=\"_blank\">Jaeger Intro \u2013 Yuri Shkuro<\/a>)<\/em><\/figcaption><\/figure>\n\n\n\n<p>The Uber app may seem simple to its end users, but behind the facade runs a complex network of microservices. Many of these services depend on other services and their sub-services.<\/p>\n\n\n\n<p>A weakness anywhere in the service chain can cause the whole user request to fall apart, i.e. no ride. <\/p>\n\n\n\n<p>In business terms, Uber risks losing ride fares at a large scale if one or more component services fail or slow down.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote has-medium-font-size\" id=\"need-jaeger\">\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><em>\u201cIn deep distributed systems, finding&nbsp;<\/em><strong>what<\/strong><em>&nbsp;is broken and&nbsp;<\/em><strong>where<\/strong><em>&nbsp;is often more difficult than&nbsp;<\/em><strong>why<\/strong><em>\u201d<\/em><\/p>\n<cite>\u2014 Yuri Shkuro, Founder &amp; Maintainer, CNCF Jaeger<\/cite><\/blockquote>\n\n\n\n<p>Jaeger tracing helps engineers find out which services are experiencing issues and where. That way, they can fix small issues before they snowball into serious problems or crises.<\/p>\n\n\n<h3 class=\"gb-headline gb-headline-e59b566b gb-headline-text\" id=\"do-your-observability-needs-justify-using-jaeger\">Do your observability needs justify using Jaeger?<\/h3>\n\n\n<p>You might be wondering whether you even need Jaeger. After all, your use case might not be as complex as Uber\u2019s. Jaeger was designed to <strong>make sense of a complex web of services and up to millions of daily requests<\/strong>.<\/p>\n\n\n\n<p>Tracing is not an absolute must-have for simpler software architectures. However, it is useful for finding bottlenecks if you have more than a handful of services. 
More than 10 services is a fair threshold.<\/p>\n\n\n\n<p>Would the following situation ever pose a problem for your software? Your application has more than 10 services and suddenly gets a traffic spike. A large volume of requests fails to complete. <\/p>\n\n\n\n<p>How will you find the culprit fast enough to fix the issue? <\/p>\n\n\n\n<p>If this scenario convinces you that you need tracing, let&#8217;s explore how Jaeger tracing works at a high level: <\/p>\n\n\n<h3 class=\"gb-headline gb-headline-0a58d77c gb-headline-text\" id=\"how-jaeger-tracing-works\">How Jaeger tracing works<\/h3>\n\n<h4 class=\"gb-headline gb-headline-1ac6d828 gb-headline-text\" id=\"step-1\"><strong>Step 1<\/strong><\/h4>\n\n\n<figure class=\"wp-block-image aligncenter size-thumbnail is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-collects-udp-150x150.png\" alt=\"\" class=\"wp-image-736\" width=\"75\" height=\"75\" srcset=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-collects-udp-150x150.png 150w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-collects-udp-300x300.png 300w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-collects-udp.png 512w\" sizes=\"(max-width: 75px) 100vw, 75px\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Jaeger Agent<\/strong>&nbsp;gathers \u201cspan data\u201d that instrumented microservices send to it over UDP<\/p>\n\n\n\n<div class=\"wp-block-stackable-icon stk-block-icon has-text-align-center stk-block stk-c631d35\" data-block-id=\"c631d35\"><style>.stk-c631d35 .stk--svg-wrapper .stk--inner-svg svg:last-child,.stk-c631d35 .stk--svg-wrapper .stk--inner-svg svg:last-child :is(g,path,rect,polygon,ellipse){fill:var(--accent) !important}<\/style><span class=\"stk--svg-wrapper\"><div class=\"stk--inner-svg\"><svg 
style=\"height:0;width:0\"><defs><linearGradient id=\"linear-gradient-c631d35\" x1=\"0\" x2=\"100%\" y1=\"0\" y2=\"0\"><stop offset=\"0%\" style=\"stop-opacity:1;stop-color:var(--linear-gradient-c-631-d-35-color-1)\"><\/stop><stop offset=\"100%\" style=\"stop-opacity:1;stop-color:var(--linear-gradient-c-631-d-35-color-2)\"><\/stop><\/linearGradient><\/defs><\/svg><svg aria-hidden=\"true\" focusable=\"false\" data-prefix=\"fas\" data-icon=\"arrow-circle-down\" class=\"svg-inline--fa fa-arrow-circle-down fa-w-16\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\" width=\"32\" height=\"32\"><path fill=\"currentColor\" d=\"M504 256c0 137-111 248-248 248S8 393 8 256 119 8 256 8s248 111 248 248zm-143.6-28.9L288 302.6V120c0-13.3-10.7-24-24-24h-16c-13.3 0-24 10.7-24 24v182.6l-72.4-75.5c-9.3-9.7-24.8-9.9-34.3-.4l-10.9 11c-9.4 9.4-9.4 24.6 0 33.9L239 404.3c9.4 9.4 24.6 9.4 33.9 0l132.7-132.7c9.4-9.4 9.4-24.6 0-33.9l-10.9-11c-9.5-9.5-25-9.3-34.3.4z\"><\/path><\/svg><\/div><\/span><\/div>\n\n\n<h4 class=\"gb-headline gb-headline-403ab4a7 gb-headline-text\" id=\"step-2\"><strong>Step 2<\/strong><\/h4>\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-pushes-to-collector.png\" alt=\"\" class=\"wp-image-737\" width=\"64\" height=\"64\" srcset=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-pushes-to-collector.png 512w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-pushes-to-collector-300x300.png 300w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-pushes-to-collector-150x150.png 150w\" sizes=\"(max-width: 64px) 100vw, 64px\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\">Data (service name, start time, duration) gets sent on to the&nbsp;<strong>Collector<\/strong><\/p>\n\n\n\n<div class=\"wp-block-stackable-icon stk-block-icon 
has-text-align-center stk-block stk-ee5c2a7\" data-block-id=\"ee5c2a7\"><style>.stk-ee5c2a7 .stk--svg-wrapper .stk--inner-svg svg:last-child,.stk-ee5c2a7 .stk--svg-wrapper .stk--inner-svg svg:last-child :is(g,path,rect,polygon,ellipse){fill:var(--accent) !important}<\/style><span class=\"stk--svg-wrapper\"><div class=\"stk--inner-svg\"><svg style=\"height:0;width:0\"><defs><linearGradient id=\"linear-gradient-ee5c2a7\" x1=\"0\" x2=\"100%\" y1=\"0\" y2=\"0\"><stop offset=\"0%\" style=\"stop-opacity:1;stop-color:var(--linear-gradient-ee-5-c-2-a-7-color-1)\"><\/stop><stop offset=\"100%\" style=\"stop-opacity:1;stop-color:var(--linear-gradient-ee-5-c-2-a-7-color-2)\"><\/stop><\/linearGradient><\/defs><\/svg><svg aria-hidden=\"true\" focusable=\"false\" data-prefix=\"fas\" data-icon=\"arrow-circle-down\" class=\"svg-inline--fa fa-arrow-circle-down fa-w-16\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\" width=\"32\" height=\"32\"><path fill=\"currentColor\" d=\"M504 256c0 137-111 248-248 248S8 393 8 256 119 8 256 8s248 111 248 248zm-143.6-28.9L288 302.6V120c0-13.3-10.7-24-24-24h-16c-13.3 0-24 10.7-24 24v182.6l-72.4-75.5c-9.3-9.7-24.8-9.9-34.3-.4l-10.9 11c-9.4 9.4-9.4 24.6 0 33.9L239 404.3c9.4 9.4 24.6 9.4 33.9 0l132.7-132.7c9.4-9.4 9.4-24.6 0-33.9l-10.9-11c-9.5-9.5-25-9.3-34.3.4z\"><\/path><\/svg><\/div><\/span><\/div>\n\n\n<h4 class=\"gb-headline gb-headline-05c37c2d gb-headline-text\" id=\"step-3\"><strong>Step 3<\/strong><\/h4>\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-dashboard.png\" alt=\"\" class=\"wp-image-738\" width=\"64\" height=\"64\" srcset=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-dashboard.png 512w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-dashboard-300x300.png 300w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-dashboard-150x150.png 150w\" sizes=\"(max-width: 64px) 
100vw, 64px\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\">Collector sends data to 2 places:&nbsp;<strong>Analytics<\/strong>&nbsp;and&nbsp;<strong>Visual Dashboard<\/strong><\/p>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><em>Et voil\u00e0!<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"480\" height=\"270\" src=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-ui-showing-tracing.png\" alt=\"\" class=\"wp-image-739\" srcset=\"https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-ui-showing-tracing.png 480w, https:\/\/sysmit.com\/cf22\/wp-content\/uploads\/jaeger-ui-showing-tracing-300x169.png 300w\" sizes=\"(max-width: 480px) 100vw, 480px\" \/><figcaption class=\"wp-element-caption\">Above: this is what tracing data looks like in the Jaeger UI <em>(Source: YouTube,&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/youtu.be\/UNqilb9_zwY?t=185\" target=\"_blank\">Jaeger Intro \u2013 Yuri Shkuro<\/a>)<\/em><\/figcaption><\/figure>\n\n\n\n<p>Now let&#8217;s explore how to install Jaeger on a Kubernetes cluster.<\/p>\n\n\n<h2 class=\"gb-headline gb-headline-d034e0f3 gb-headline-text\" id=\"how-to-setup-jaeger\">How to set up Jaeger<\/h2>\n\n<h3 class=\"wp-block-heading\" style=\"font-size:clamp(16.834px, 1.052rem + ((1vw - 3.2px) * 0.975), 26px);\" id=\"2-ways-to-install-jaeger-on-kubernetes\">2 ways to install Jaeger on Kubernetes<\/h3>\n\n\n<p>I will assume that you know how Kubernetes clusters are structured in terms of containers, nodes, pods, sidecars, etc. <\/p>\n\n\n\n<p>Jaeger Agent can run on a Kubernetes cluster in two distinct ways: as a DaemonSet or as a sidecar. 
Let\u2019s compare both of them.<\/p>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-a68fe39\" data-block-id=\"a68fe39\"><style>.stk-a68fe39{margin-bottom:50px !important}<\/style><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-a68fe39-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-e9c8c0c\" data-v=\"4\" data-block-id=\"e9c8c0c\"><style>.stk-e9c8c0c-container{background-color:rgba(245,243,242,0.5) !important}.stk-e9c8c0c-container:before{background-color:var(--base) !important}<\/style><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-e9c8c0c-container\"><div class=\"stk-block-content stk-inner-blocks stk-e9c8c0c-inner-blocks\">\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-2e67828\" data-block-id=\"2e67828\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-2e67828-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-bdc5e2f\" data-v=\"4\" data-block-id=\"bdc5e2f\"><style>.stk-bdc5e2f-container{background-color:var(--base-3) !important}.stk-bdc5e2f-container:before{background-color:var(--base-3) !important}<\/style><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-bdc5e2f-container stk-hover-parent\"><div class=\"stk-block-content stk-inner-blocks stk-bdc5e2f-inner-blocks\"><h4 class=\"wp-block-heading\" style=\"font-size:clamp(14.642px, 0.915rem + ((1vw - 3.2px) * 0.783), 22px);font-style:normal;font-weight:500\" id=\"setup-jaeger-as-a-daemonset\">Set up Jaeger as a DaemonSet<\/h4>\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Mechanism:&nbsp;<\/strong>Jaeger Agent runs as one pod per node and collects data from all other pods on the same node<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Useful 
for:&nbsp;<\/strong>single tenant or non-production clusters<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Benefits:<\/strong>&nbsp;lower memory overhead, more straightforward setup<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Risk:&nbsp;<\/strong>security risk if deployed on a multi-tenant cluster<\/p>\n\n\n\n<p><a href=\"https:\/\/www.digitalocean.com\/community\/tutorials\/how-to-implement-distributed-tracing-with-jaeger-on-kubernetes\">LEARN BY DOING: simple Jaeger setup tutorial<\/a> via Digital Ocean<\/p>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-de9f95d\" data-block-id=\"de9f95d\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-de9f95d-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-96a76dc\" data-v=\"4\" data-block-id=\"96a76dc\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-96a76dc-container stk--no-background stk--no-padding\"><div class=\"stk-block-content stk-inner-blocks stk-96a76dc-inner-blocks\"><\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-c1b505b\" data-block-id=\"c1b505b\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-c1b505b-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-9ef46ef\" data-v=\"4\" data-block-id=\"9ef46ef\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-9ef46ef-container stk--no-background stk--no-padding\"><div class=\"stk-block-content stk-inner-blocks stk-9ef46ef-inner-blocks\">\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-f58dab1\" data-block-id=\"f58dab1\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align 
stk-f58dab1-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-af8f29e\" data-v=\"4\" data-block-id=\"af8f29e\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-af8f29e-container stk-hover-parent\"><div class=\"stk-block-content stk-inner-blocks stk-af8f29e-inner-blocks\"><h4 class=\"wp-block-heading\" style=\"font-size:clamp(14.642px, 0.915rem + ((1vw - 3.2px) * 0.783), 22px);font-style:normal;font-weight:500\" id=\"setup-jaeger-as-a-sidecar\">Set up Jaeger as a sidecar<\/h4>\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Mechanism:&nbsp;<\/strong>Jaeger Agent runs as a container alongside the service container within every pod<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Useful for:&nbsp;<\/strong>multi-tenant clusters, public cloud clusters<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Benefits:<\/strong>&nbsp;granular control, higher security potential<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Risk:&nbsp;<\/strong>more DevOps supervision required<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/jaegertracing\/jaeger-kubernetes#deploying-the-agent-as-sidecar\">LEARN BY DOING: deploy Jaeger as a sidecar<\/a> via Jaeger&#8217;s GitHub<\/p>\n<\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n\n\n\n\n\n<p>Remember from earlier that the Jaeger Agent gathers span data from your services? Jaeger does not have to record every request; it can sample them.<\/p>\n\n\n\n<p>There are 2 methods for sampling traces: head-based sampling and tail-based sampling. Each has its benefits and downsides. 
Let\u2019s explore:<\/p>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-9a740ee\" data-block-id=\"9a740ee\"><style>.stk-9a740ee{margin-bottom:50px !important}<\/style><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-9a740ee-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-aa986d2\" data-v=\"4\" data-block-id=\"aa986d2\"><style>.stk-aa986d2-container{background-color:rgba(245,243,242,0.5) !important}.stk-aa986d2-container:before{background-color:var(--base) !important}<\/style><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-aa986d2-container\"><div class=\"stk-block-content stk-inner-blocks stk-aa986d2-inner-blocks\">\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-ff06d5f\" data-block-id=\"ff06d5f\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-ff06d5f-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-3c5d25d\" data-v=\"4\" data-block-id=\"3c5d25d\"><style>.stk-3c5d25d-container{background-color:var(--base-3) !important}.stk-3c5d25d-container:before{background-color:var(--base-3) !important}<\/style><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-3c5d25d-container stk-hover-parent\"><div class=\"stk-block-content stk-inner-blocks stk-3c5d25d-inner-blocks\"><h4 class=\"wp-block-heading\" style=\"font-size:clamp(14.642px, 0.915rem + ((1vw - 3.2px) * 0.783), 22px);font-style:normal;font-weight:500\" id=\"headsbased-sampling\">Head-based sampling<\/h4>\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Also known as<\/strong>&nbsp;upfront sampling<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Mechanism:&nbsp;<\/strong>sampling decision is made before request completion<\/p>\n\n\n\n<p 
style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Useful for:&nbsp;<\/strong>high-throughput use cases, looking at aggregated data<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Benefits:<\/strong>&nbsp;cheaper sampling method \u2013 lower network and storage overhead<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Risk:&nbsp;<\/strong>potential to miss outlier requests due to less than 100% sampling<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Work required:<\/strong>&nbsp;easy setup, supported by Jaeger SDKs<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Configuration notes:&nbsp;<\/strong>sampling is probabilistic (a coin flip per trace) or rate-limited (up to a set number of traces per second)<\/p>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-62502e2\" data-block-id=\"62502e2\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-62502e2-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-ef701a3\" data-v=\"4\" data-block-id=\"ef701a3\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-ef701a3-container stk--no-background stk--no-padding\"><div class=\"stk-block-content stk-inner-blocks stk-ef701a3-inner-blocks\"><\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-b718c43\" data-block-id=\"b718c43\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-b718c43-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-e3e088b\" data-v=\"4\" data-block-id=\"e3e088b\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-e3e088b-container 
stk--no-background stk--no-padding\"><div class=\"stk-block-content stk-inner-blocks stk-e3e088b-inner-blocks\">\n<div class=\"wp-block-stackable-columns stk-block-columns stk-block stk-aacf9a9\" data-block-id=\"aacf9a9\"><div class=\"stk-row stk-inner-blocks stk-block-content stk-content-align stk-aacf9a9-column\">\n<div class=\"wp-block-stackable-column stk-block-column stk-column stk-block stk-eb00dd1\" data-v=\"4\" data-block-id=\"eb00dd1\"><div class=\"stk-column-wrapper stk-block-column__content stk-container stk-eb00dd1-container stk-hover-parent\"><div class=\"stk-block-content stk-inner-blocks stk-eb00dd1-inner-blocks\"><h4 class=\"wp-block-heading\" style=\"font-size:clamp(14.642px, 0.915rem + ((1vw - 3.2px) * 0.783), 22px);font-style:normal;font-weight:500\" id=\"tailsbased-sampling\">Tail-based sampling<\/h4>\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Also known as&nbsp;<\/strong>response sampling<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Mechanism:&nbsp;<\/strong>sampling decision is made after the request has been completed<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Useful for:&nbsp;<\/strong>catching anomalies in latency, failed requests<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Benefits:<\/strong>&nbsp;more intelligent approach to looking at request data<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Risk:&nbsp;<\/strong>every trace must be stored temporarily \u2013 more infra overhead, limited to a single node<\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Work required:<\/strong>&nbsp;extra work \u2013 connect to a tool that supports tail-based sampling&nbsp;<a rel=\"noreferrer noopener\" 
href=\"https:\/\/web.archive.org\/web\/20210421143122\/https:\/\/lightstep.com\/jaeger\/\" target=\"_blank\">like Lightstep<\/a><\/p>\n\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\"><strong>Configuration notes:&nbsp;<\/strong>sampling based on latency criteria and tags<\/p>\n<\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n<\/div><\/div><\/div>\n<\/div><\/div>\n\n\n\n<p>Once you&#8217;ve picked your sampling method, you also need to consider that Jaeger&#8217;s collector has finite capacity. <\/p>\n\n\n<h3 class=\"gb-headline gb-headline-a99362e2 gb-headline-text\" id=\"prevent-jaegers-collector-from-getting-clogged\">Prevent Jaeger&#8217;s collector from getting clogged<\/h3>\n\n\n<p>Jaeger\u2019s collector holds data temporarily before it writes it to a database. The visual dashboard then queries this database. But the collector can get clogged if the database can\u2019t accept writes fast enough during high-traffic situations.<\/p>\n\n\n<h4 class=\"gb-headline gb-headline-8b0cb217 gb-headline-text\" id=\"problem\"><strong>Problem:<\/strong> <\/h4>\n\n\n<ul style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\">\n<li>The collector\u2019s temporary storage model becomes problematic during traffic spikes<\/li>\n\n\n\n<li>Some data gets dropped so the collector can stay afloat amid the flood of incoming request data<\/li>\n\n\n\n<li>Your tracing may look patchy in areas because of the gaps in sampling data<\/li>\n\n\n\n<li><strong>Risk of missing failed or problematic requests<\/strong>&nbsp;if they were among the samples that get dropped<\/li>\n<\/ul>\n\n\n<h4 class=\"gb-headline gb-headline-e44d03c3 gb-headline-text\" id=\"solution\"><strong>Solution:<\/strong><\/h4>\n\n\n<ul style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\">\n<li>Consider&nbsp;an asynchronous span ingestion technique to solve this problem<\/li>\n\n\n\n<li>This means adding a few 
components between your collector and database:\n<ol>\n<li>Apache Kafka \u2013&nbsp;<em>real-time<\/em>&nbsp;data streaming at scale<\/li>\n\n\n\n<li>Apache Flink \u2013 processes Kafka data <em>asynchronously<\/em><\/li>\n\n\n\n<li>2 Jaeger components \u2013 jaeger-ingester and jaeger-indexer \u2013 push Flink output to storage<\/li>\n<\/ol>\n<\/li>\n<\/ul>\n\n\n\n<p>Once these components are in place, the collector will be less likely to get overloaded and drop data. <\/p>\n\n\n<h4 class=\"gb-headline gb-headline-b45e5802 gb-headline-text\" id=\"implementation-reading\"><strong>Implementation reading:<\/strong><\/h4>\n\n\n<p>These links, read in order, might help you get started with your implementation:<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><a href=\"https:\/\/web.archive.org\/web\/20210421143122\/https:\/\/youtu.be\/UNqilb9_zwY?t=1721\" target=\"_blank\" rel=\"noreferrer noopener\">YouTube \u2013 Jaeger straight-to-DB vs async write method<\/a><\/p>\n\n\n\n<p class=\"has-medium-font-size\"><a href=\"https:\/\/web.archive.org\/web\/20210421143122\/https:\/\/www.kubenuts.com\/jaeger-tracing-kubernetes\/\">YouTube \u2013 Apache Kafka videos by Confluent<\/a><\/p>\n\n\n\n<p class=\"has-medium-font-size\"><a href=\"https:\/\/web.archive.org\/web\/20210421143122\/https:\/\/www.kubenuts.com\/jaeger-tracing-kubernetes\/\" target=\"_blank\" rel=\"noreferrer noopener\">Practical overview (with example) of Apache Flink<\/a><\/p>\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:clamp(16.834px, 1.052rem + ((1vw - 3.2px) * 0.975), 26px);\" id=\"wrapping-up\">Wrapping up<\/h2>\n\n\n<p style=\"font-size:clamp(15.747px, 0.984rem + ((1vw - 3.2px) * 0.878), 24px);\">This concludes our article on Jaeger and the promise it holds for distributed tracing of microservices, as well as the wider observability apparatus. 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, I will share how tracing and more specifically Jaeger tracing can fit into your wider software observability strategy. Before we get into tracing, let&#8217;s define observability. What is observability? Observability is a comprehensive means of gaining data on how software services perform in production. This data gives you a picture of the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[60,32],"tags":[30,31],"_links":{"self":[{"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/posts\/729"}],"collection":[{"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/comments?post=729"}],"version-history":[{"count":21,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/posts\/729\/revisions"}],"predecessor-version":[{"id":5759,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/posts\/729\/revisions\/5759"}],"wp:attachment":[{"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/media?parent=729"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/categories?post=729"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sysmit.com\/cf22\/wp-json\/wp\/v2\/tags?post=729"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}