Observability vs. monitoring
The term “monitoring” has taken a beating lately, with vendors scrambling to position themselves as observability providers rather than monitoring providers. We believe this is a false choice: using dashboards and alerts to monitor select critical data is one aspect of a comprehensive observability practice, but observability requires much more.
Monitoring, as explained in What is Observability?, alerts you when something is wrong. It requires that you decide up front which signals you want to watch (your “known unknowns”). Observability, by contrast, is the ability to ask why, giving you the freedom to investigate “unknown unknowns” on the fly.
Because forecasting every failure mode in today’s world of complex, distributed systems built on hundreds of microservices is unrealistic, you need the flexibility to ask any question of your data to solve complicated problems.
What is cardinality?
As noted above, monitoring critical aggregate signals (i.e., metrics) is vital for keeping your stack running smoothly. These signals provide crucial information about the general health and performance of your systems. They do, however, come with trade-offs, which we’ll address momentarily.
But first, let’s clarify two essential terms:
- Data dimensions such as CustomerID, URL, Region, and Date act as ‘keys’ in key-value pairs and map to concepts that matter to your business, such as “customers,” “products,” “stores,” and “time.”
- Cardinality is the number of distinct values within a data dimension. CustomerID (unique per customer) is high-cardinality data, whereas Region is low-cardinality data.
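To make the distinction concrete, here is a minimal sketch in Python (the sample events and field names are hypothetical) that counts the distinct values in each dimension; the dimension with the most distinct values is the highest-cardinality one.

```python
from collections import defaultdict

# Hypothetical telemetry events, each a set of key-value dimensions.
events = [
    {"CustomerID": "c-1001", "Region": "us-east", "URL": "/checkout"},
    {"CustomerID": "c-1002", "Region": "us-east", "URL": "/cart"},
    {"CustomerID": "c-1003", "Region": "eu-west", "URL": "/checkout"},
    {"CustomerID": "c-1004", "Region": "us-east", "URL": "/checkout"},
]

# Cardinality of a dimension = the number of distinct values it takes.
distinct_values = defaultdict(set)
for event in events:
    for key, value in event.items():
        distinct_values[key].add(value)

for dimension, values in distinct_values.items():
    print(f"{dimension}: cardinality {len(values)}")
# CustomerID takes one value per customer (high cardinality);
# Region takes only a few values (low cardinality).
```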
Low-cardinality data, often captured as dimensional metrics, has drawbacks. Metrics are aggregates such as count, sum, min, max, average, and latest; as each function name implies, they summarize specific events or sampled transactions. Here are a few examples:
- Maximum CPU usage per minute
- Average 504 errors per second
- Number of transactions per hour
Knowing how many 504 errors you have doesn’t tell you why you have them. To get around this limitation, you can add tags (dimensions) to the data; each tag expands the data’s primary key and makes the information more useful for troubleshooting.
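As a rough illustration (the metric, tag keys, and helper function below are assumptions, not any particular vendor’s API), a dimensional counter keyed by its tags might look like this:

```python
from collections import Counter

# Hypothetical dimensional metric: count of HTTP 504 errors,
# tagged with the dimensions we want to slice by later.
error_counts = Counter()

def record_504(host: str, service: str, region: str) -> None:
    # Each unique combination of tag values becomes its own series,
    # so every added dimension multiplies the metric's cardinality.
    error_counts[(host, service, region)] += 1

record_504("host-17", "checkout", "us-east")
record_504("host-17", "checkout", "us-east")
record_504("host-42", "cart", "eu-west")

# Aggregate view: how many 504s overall?
print(sum(error_counts.values()))  # 3
# Tagged view: which host/service/region combinations are failing?
for tags, count in error_counts.items():
    print(tags, count)
```

The trade-off is exactly the one described next: the more tags you add, the more useful the data becomes for troubleshooting, and the more distinct series you have to store and query.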
What limits cardinality
Naturally, everyone would like their monitoring and observability tools to record every transaction throughout their stack, each with an unlimited number of key-value pairs, so they can correlate failures with root causes. But there’s a catch: most monitoring solutions cannot handle high-cardinality data.
The volume of data grows in lockstep with cardinality, and processing it quickly and efficiently demands a massive amount of compute and storage. Most tools simply aren’t powerful enough to scale to these requirements; surprisingly, some don’t even let you run ad hoc queries on your data.
Because they can’t scale at query time, one SaaS monitoring vendor, for example, asks you to pre-identify the data you want to facet at ingest time. This architecture not only slows their ingestion pipeline (and your time to an answer), it also severely limits your ability to query your data on the fly.
Because of these limits, you’re forced to accept one or more of the following compromises:
- Strict limits on how many dimensions you can add
- Shorter retention for data that has been sampled or aggregated
- Penalties for exceeding a (much too low) hard limit on the number of tags you can use
You’re left with plain monitoring: the ability to spot an issue, but not the data granularity to pinpoint and fix the root cause.
What high-cardinality data delivers
Low-cardinality data and monitoring can help you detect potential problems. But determining which customers (or hosts, app IDs, processes, or SQL queries) are linked to a problem requires high-cardinality data. High-cardinality data gives you the granularity and precision to isolate and identify the root cause of an issue, so you can pinpoint precisely where and why it happened.
Before the term observability entered the industry lexicon, our platform focused on high-cardinality data, capturing discrete, detailed records such as application transactions and browser page views. Consider how customers place orders on a website: for each transaction, an event data model lets you record the userID, the dollar amount, the number of items purchased, the processing time, and any other attributes you need.
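A sketch of what one such transaction event might look like (the field names are illustrative, not a prescribed schema):

```python
import time

# One hypothetical order transaction, recorded as a discrete event
# rather than folded into an aggregate metric.
order_event = {
    "eventType": "OrderTransaction",
    "timestamp": time.time(),
    "userID": "c-1003",      # high-cardinality dimension
    "amountUSD": 42.50,
    "itemCount": 3,
    "durationMs": 187,
    "host": "host-17",
    "httpStatus": 200,
}

# Because every attribute is kept per event, any of them can later be
# filtered, grouped, or correlated without up-front aggregation.
print(order_event["userID"], order_event["amountUSD"])
```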
As your business grows, you can add more attributes to those transactions and instrument new transactions with different attributes. Such granular data lets you investigate individual transactions in ways that go beyond low-cardinality, aggregated metrics; a short sketch following the list below shows how. You might want to ask:
- Show me all users who had a 504 error between 9:30 and 10:30 a.m. today.
- Were they dispersed over all hosts or just a few?
- What services did they have access to?
- What kind of database calls were made?
- Were they looking to buy something specific?
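As a minimal sketch of the first two questions, here is plain Python over a hypothetical in-memory list of events (not any specific query language or vendor API):

```python
from collections import Counter
from datetime import datetime

# Hypothetical transaction events with per-event attributes.
events = [
    {"userID": "c-1001", "host": "host-17", "httpStatus": 504,
     "timestamp": datetime(2024, 11, 27, 9, 42)},
    {"userID": "c-1002", "host": "host-17", "httpStatus": 504,
     "timestamp": datetime(2024, 11, 27, 10, 5)},
    {"userID": "c-1003", "host": "host-42", "httpStatus": 200,
     "timestamp": datetime(2024, 11, 27, 9, 50)},
]

window_start = datetime(2024, 11, 27, 9, 30)
window_end = datetime(2024, 11, 27, 10, 30)

# "Show me all users who had a 504 error between 9:30 and 10:30 a.m."
failed = [e for e in events
          if e["httpStatus"] == 504
          and window_start <= e["timestamp"] <= window_end]
print(sorted({e["userID"] for e in failed}))

# "Were they dispersed over all hosts or just a few?"
print(Counter(e["host"] for e in failed))
```

The remaining questions are answered the same way, by filtering or grouping on other per-event attributes, which is the point of keeping the data at full cardinality.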
Ask the unknown unknowns
Because each transaction is a discrete event, you can examine the distribution of any attribute, as well as the correlations between attributes. If your pre-built dashboards don’t address the problem in front of you, you’re not left in the dark, and you don’t have to specify every question ahead of time.
By consolidating your telemetry data onto a single platform, you get a comprehensive view of your entire technology stack. When investigating performance issues, you can traverse all of your telemetry data across common dimensions.
We designed the world’s most sophisticated telemetry data platform to give you these capabilities, and it’s built to handle high-cardinality data. The Telemetry Data Platform is a multi-tenant cloud service that runs queries on thousands of CPU cores simultaneously, processing billions of records from petabytes of data and returning results in milliseconds.
There’s also no need to worry about tags causing your costs to skyrocket. Our industry-leading price of $0.25 per GB of data ingested gives you cost certainty and the assurance that your data will be available when you need it.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.