What We Do in the Shadows: The Serverless Tracing Specter
Serverless is a dream. To be genuinely “serverless,” there must be a clear line between the technical concerns that matter to your business and the arbitrary, tedious platform hosting chores you can delegate to a vendor. Real-world “serverless” products only come close to realizing this dream, leaving you to manage by hand things that, in an ideal world, would be automated, handled by the vendor, and work flawlessly for every user. Users’ troubles with serverless frequently come from these gaps: security configuration, cold starts, and observability. They’re the kinds of issues you’d expect when the vendor is in charge of most, but not all, aspects of executing code. This post concentrates on one observability challenge in particular: serverless tracing.
As you split microservices into distinct serverless steps, the architecture naturally becomes event-based. I don’t have room to detail the benefits of event-based design in this article, but suffice it to say that most serverless developers end up adopting it. The observability problem arises in an architecture where real-world actions generate one or more events that are handled by several services. When many services are handling an event, sometimes extremely quickly and asynchronously, observability (the goal of delivering information about your technology that is easy to read and actionable) becomes tough.
Let’s start with a hypothetical situation. A problem has been reported on an e-commerce site: when a customer enters a coupon code, it is sometimes rejected as invalid when it should be valid. Because the issue is intermittent, straightforward end-to-end testing can’t reliably reproduce it. It calls for an examination of real-world behavior. When you look at the dashboards for the various components in your stack, there are thousands of logged events, and tracing any one of them quickly becomes complicated.
So how does this event relate to the others on our platform?
It can be challenging to correlate events by timestamp or by event properties (such as customer ID), because asynchronous processing causes timestamps to differ. Some systems, such as queueing services or API gateways, don’t provide precise logging mechanisms. During an active incident, the question of which database event generated a specific compute event can sidetrack operations teams.
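One common way to avoid relying on timestamps is to stamp a correlation ID on the event where it enters the system and copy it onto every event derived from it. The following is only a minimal sketch of that idea: the event shape and the helper names `new_event`, `derive_event`, and `related` are hypothetical, not part of any vendor API.

```python
import uuid


def new_event(event_type, payload):
    """Create an entry-point event with a fresh correlation ID (hypothetical shape)."""
    return {
        "event_id": str(uuid.uuid4()),
        "correlation_id": str(uuid.uuid4()),
        "type": event_type,
        "payload": payload,
    }


def derive_event(parent, event_type, payload):
    """Create a downstream event that inherits the parent's correlation ID."""
    return {
        "event_id": str(uuid.uuid4()),
        "correlation_id": parent["correlation_id"],
        "parent_id": parent["event_id"],
        "type": event_type,
        "payload": payload,
    }


def related(events, correlation_id):
    """Return every event that belongs to the same real-world action."""
    return [e for e in events if e["correlation_id"] == correlation_id]
```

With this in place, the coupon submission and every event it triggers share one ID, so they can be grouped regardless of how far apart their timestamps drift.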
The more pressing question is: “How does this event relate to other events on our platform?” Standard cloud introspection tools will answer certain seemingly relevant questions, such as:
- Which service can connect to other services?
- What services were logged as having errors?
- What is each service’s throughput rate?
You can often observe telling trends from these answers, and from whatever metrics are available on the service components, such as the customer ID mentioned above. For example, when one service breaks, traffic on another may drop to zero. However, these techniques frequently fall short when you are looking into a specific pattern or a single incident.
The Serverless Tracing Specter
Structured logs are an excellent way to plan for observability.
When working with massive amounts of data at high speed, people often talk about “drinking from the firehose.” Dealing with log data from several cloud services can feel exactly like that. Parsing your logs for observability can be a pain, and if your logs are unstructured, you’ll waste time trying to filter them with regular expressions and still get false positives.
AWS strongly recommends structured logs. Structured logs are generally written as JSON, which makes it simple to run queries on several parameters at once. This substantially simplifies the search for relevant data.
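As an illustration, here is a small sketch of JSON logging with Python’s standard `logging` module, plus a filter that matches on several fields at once. The `JsonFormatter` class, the `build_logger` and `matching` helpers, and the `context` key passed through `extra=` are all assumptions made for this example, not an AWS-prescribed format.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed as extra={"context": {...}} become a record attribute.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def matching(lines, **criteria):
    """Return parsed entries whose fields equal every given criterion."""
    entries = (json.loads(line) for line in lines if line.strip())
    return [e for e in entries if all(e.get(k) == v for k, v in criteria.items())]
```

A call such as `logger.info("coupon rejected", extra={"context": {"customer_id": "c-42", "coupon_code": "SAVE10"}})` then produces a line that `matching(lines, customer_id="c-42", coupon_code="SAVE10")` can find without any regular expressions.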
Even if you use structured logs, you’ll run into issues with services that don’t let you control how they log. For example, API gateways and queueing services don’t always allow you to organize logs neatly.
AWS X-Ray integration is built for observability.
At some point, only your cloud provider will have real insight into what’s happening between the components of your stack. Because the provider manages and administers the internal routing layer, it should understand the entire system well. AWS X-Ray presents a map of your application’s underlying components and provides an end-to-end view of requests as they flow through your application.
AWS X-Ray provides that level of visibility, but it has limitations. While X-Ray can give detailed insight into the path of a single transaction, you must combine it with observability tools that highlight the issues you’re most concerned about.
After all, X-Ray provides information on a single request and its links, but not on how requests perform as a group.
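To see how requests behave as a group, per-request records have to be aggregated somewhere. The sketch below assumes each request has already been reduced to a dictionary with hypothetical `service` and `error` fields; it does not use the X-Ray API and is only meant to show the kind of group-level view a single trace cannot give.

```python
from collections import defaultdict


def error_rates(records):
    """Compute the fraction of failed requests per service.

    Each record is assumed to look like {"service": str, "error": bool}.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        totals[record["service"]] += 1
        if record["error"]:
            errors[record["service"]] += 1
    return {service: errors[service] / totals[service] for service in totals}
```

In the coupon scenario, a rate well above zero on the validation service, next to a clean cart service, would point the investigation at coupon handling before anyone opens an individual trace.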
The unfinished business of the specter: The beginning of observability is in the code.
The promise of serverless is that the vendor will handle all of the tedious, needless work. It’s also tempting to try tools that promise flawless insight into our stack with no extra effort.
The reality is that, while observability is difficult, it is not a waste of time. Only developers fully understand what is going on inside their code. While a tool’s agent running alongside your serverless code can provide valuable automatic insights, there will always be details whose significance only your team understands. As a result, the people writing the code must plan for observability as they write it.
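One way to build this in from the start is to wrap each handler so that it always emits a record carrying the domain fields your team cares about. This is a minimal sketch under stated assumptions: the `observed` decorator, its `describe` callback, and the `sink` parameter are hypothetical names invented for this example.

```python
import functools
import json
import time


def observed(describe, sink=print):
    """Wrap a handler so every call emits one JSON record.

    `describe` maps the incoming event to the domain fields worth recording;
    `sink` receives the serialized record (print by default).
    """
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event):
            record = {"handler": handler.__name__, **describe(event)}
            start = time.perf_counter()
            try:
                result = handler(event)
                record["outcome"] = "ok"
                return result
            except Exception as exc:
                record["outcome"] = "error"
                record["error_type"] = type(exc).__name__
                raise
            finally:
                # Runs on both success and failure, so no call goes unrecorded.
                elapsed = time.perf_counter() - start
                record["duration_ms"] = round(elapsed * 1000, 3)
                sink(json.dumps(record))
        return wrapper
    return decorator
```

A coupon handler decorated with `@observed(lambda e: {"coupon_code": e["code"]})` would then record the coupon code alongside the outcome and duration of every call, which is exactly the kind of detail an automatic agent has no way to know is important.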
Enteros