The Subtle Differences Between Logs, Metrics, and Audits

“We should log that.” 

Developers often throw this phrase around, using it as a catch-all for any type of error or information storage. Although logs are an invaluable tool, they aren’t the only option for recording and tracking events.

To understand the differences between logs, metrics, and audits, let’s take a look at their characteristics and ideal use cases.

What Are Logs?

Logs are developer-focused records used to track events, errors, and other information within a system. Logs help with API integration, troubleshooting, and generally monitoring the health status of the application.

Logs are written at several levels, with each level indicating a different degree of urgency. These levels have different names depending on which logging framework you’re using, but the most common levels, from most to least urgent, are:

1. Critical, also known as fatal, is the highest log level. Critical is used when an application won’t boot at all or in situations where there is guaranteed data corruption or loss.

2. Error is used for things like unhandled exceptions and other issues that impact the operation but not necessarily the application or service.

3. Warning is used for things that are potential problems. In isolation, a single warning likely wouldn’t cause issues, but dozens of warning-level events could be an indicator of a more serious error.

4. Information is typically used to summarize a request. Usually, you have one information log per request.

5. Debug is used to track the steps the application takes while handling a request and to record diagnostically helpful details.
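As a rough illustration of these levels, here is a minimal sketch using Python’s standard logging module, which calls the levels CRITICAL, ERROR, WARNING, INFO, and DEBUG. The logger name "orders" and all of the messages are made up for the example, and other frameworks name and order their levels differently.

```python
import io
import logging

# Hypothetical setup: write formatted records to an in-memory stream
# so the result is easy to inspect. A real application would usually
# log to a file, stdout, or a log aggregation service.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("orders")  # assumed name, for illustration only
logger.setLevel(logging.INFO)         # anything below INFO is dropped
logger.addHandler(handler)
logger.propagate = False              # keep records out of the root logger

logger.debug("loaded cart for user 42")              # filtered out at INFO
logger.info("POST /orders completed in 120 ms")      # one summary per request
logger.warning("payment retry 1 of 3")               # potential problem
logger.error("unhandled exception while saving order")
logger.critical("database unreachable, shutting down")

output = stream.getvalue()
print(output)
```

Setting the threshold to INFO is a common choice for production, since debug-level detail tends to be noisy. Lowering it to DEBUG while troubleshooting brings the step-by-step records back.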

When to Use Logs

The key takeaway is that logs are developer-focused, and other functions likely won’t ever need to access them. Another thing to note is that, because of how they are stored, you can’t be 100% certain that you’ll be able to access your logs in the event of a server crash. While not ideal, this is usually acceptable because logs shouldn’t contain sensitive data, and they aren’t referenced for legal or compliance reasons (more on this later).

What Are Metrics?

While logs track events and actions as they occur, metrics measure a system’s performance at a certain point in time — or at fixed intervals. A metric is a number, usually a counter or a gauge, that the developer decides is important to the observation and maintenance of the system. There are many different types of metrics, but we’ll focus on two types: application metrics and business metrics.

Application metrics are specific technical benchmarks that help you gauge system health. Things like queue depth, CPU and memory usage, and network response times fall into this category.

Business metrics, on the other hand, help you understand less technical things, such as how many times users pressed a certain button or downloaded a certain document. 

When to Use Metrics

Use metrics to measure specific performance indicators at specific times. Unlike logs, which are written when an event occurs, metrics are typically collected at set time intervals. For example, if you want to keep an eye on queue depth, you could create a metric to measure queue depth at specific times each day.
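To make the counter and gauge distinction concrete, here is a minimal in-memory sketch. The Counter and Gauge classes, the sample_queue_depth helper, and the job names are all hypothetical and written for this example. In practice you would most likely use a metrics library and a scheduler rather than hand-rolled classes and manual timestamps.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Counter:
    """A value that only goes up, e.g. how many times a document was downloaded."""
    value: int = 0

    def inc(self, amount: int = 1) -> None:
        self.value += amount


@dataclass
class Gauge:
    """A point-in-time reading that can rise or fall, e.g. queue depth."""
    samples: list = field(default_factory=list)

    def record(self, timestamp: float, value: float) -> None:
        self.samples.append((timestamp, value))


def sample_queue_depth(gauge: Gauge, queue: deque, timestamp: float) -> None:
    # Called on a fixed schedule, not in response to any particular event.
    gauge.record(timestamp, len(queue))


downloads = Counter()       # business metric
queue_depth = Gauge()       # application metric
work_queue = deque(["job-1", "job-2", "job-3"])

sample_queue_depth(queue_depth, work_queue, timestamp=0.0)
work_queue.popleft()        # a worker picks up one job
downloads.inc()             # a user downloads a document
sample_queue_depth(queue_depth, work_queue, timestamp=60.0)

print(queue_depth.samples)  # two readings, one per sampling interval
print(downloads.value)
```

Notice that the gauge says nothing about what happened between the two samples. That is the trade-off with metrics: they are cheap to collect and easy to chart, but they don’t explain individual events the way logs do.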

Like logs, metrics can be susceptible to data loss in the event of a crash, depending on your storage methods. But because you may want to monitor some metrics over a longer period of time, there are options for securely storing metric data over the long term. 

What Are Audits?

In contrast to logs and metrics, which are mainly focused on events and actions occurring within a software system, audits capture information about which users are performing actions and when. Audits typically serve legal, compliance, and/or traceability purposes.

Audits are time-stamped records with details about the user and any actions they take, including adding or editing content, removing content, accessing data, and performing transactions.
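A record like this could be modeled as a small immutable structure. The sketch below is only one possible shape: the AuditRecord class, its field names, and the sample user and resource identifiers are assumptions made for illustration, and a real system’s fields would be driven by its own compliance requirements.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: a record can't be changed after creation
class AuditRecord:
    timestamp: str  # when the action happened (ISO 8601, UTC)
    user_id: str    # who performed the action
    action: str     # what they did: "add", "edit", "remove", "access", ...
    resource: str   # what they did it to


def make_audit_record(
    user_id: str,
    action: str,
    resource: str,
    now: Optional[datetime] = None,
) -> AuditRecord:
    # Allow the caller to pass a fixed time, which keeps tests deterministic.
    moment = now or datetime.now(timezone.utc)
    return AuditRecord(moment.isoformat(), user_id, action, resource)


record = make_audit_record(
    user_id="u-1001",                 # hypothetical identifiers
    action="edit",
    resource="claim/55",
    now=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(asdict(record))
```

Making the record frozen is a small nod to the idea that audit data is written once and never edited, although real tamper resistance has to come from the storage layer.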

When to Use Audits

In systems with authorization hierarchies, audits help you track what different users are doing at any given time. In some industries, certain laws and governing agencies require audits for compliance reasons. Health insurance systems, for example, record audit information in compliance with HIPAA regulations.

Due to the sensitive nature of audits, any data loss is unacceptable. For example, a health insurance system must never allow a user to edit content while failing to record audit data about that edit. For this reason, audit logging systems are typically designed to fail the entire request if audit data can’t be stored for whatever reason.
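The fail-the-request behavior can be sketched as follows. Everything here is hypothetical: InMemoryAuditStore, AuditWriteError, and edit_document stand in for whatever storage and request handling a real system uses. The sketch writes the audit entry before applying the change. A production system would more likely wrap both writes in a single transaction so that neither can succeed without the other.

```python
class AuditWriteError(Exception):
    """Raised when an audit entry can't be stored."""


class InMemoryAuditStore:
    """Stand-in for a durable audit store; 'available' simulates an outage."""

    def __init__(self, available: bool = True) -> None:
        self.available = available
        self.records: list = []

    def append(self, record: dict) -> None:
        if not self.available:
            raise AuditWriteError("audit store unavailable")
        self.records.append(record)


def edit_document(documents, audit_store, user_id, doc_id, new_text) -> None:
    # Record the audit entry first. If this raises, the exception propagates
    # and the edit below never runs, so no change goes unaudited.
    audit_store.append({"user": user_id, "action": "edit", "doc": doc_id})
    documents[doc_id] = new_text


documents = {"policy-1": "old text"}

healthy = InMemoryAuditStore()
edit_document(documents, healthy, "u-1001", "policy-1", "new text")

broken = InMemoryAuditStore(available=False)
try:
    edit_document(documents, broken, "u-1001", "policy-1", "unaudited text")
    request_failed = False
except AuditWriteError:
    request_failed = True  # the whole request is rejected

print(documents["policy-1"], request_failed)
```

Compare this with logging, where a failed write is usually swallowed so that the request can carry on. With audits, the priorities are reversed.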

This blog originated from a Lean BYTES presentation given by Lean TECHniques Director of Engineering, Scott Sauber. Lean BYTES are short, 16-minute webinars where you can get the quick hits on a variety of development and IT-related topics.