
Introduce task local logger #315

Open

FranzBusch wants to merge 1 commit into main from fb-task-local-logger
Conversation

FranzBusch (Member)

Motivation

Structured logging and structured concurrency are a match made in heaven. Using task locals for loggers enables easier passing of metadata for log messages and removes the cumbersome passing around of loggers.

Modification

This PR adds a proposal, the implementation and benchmarks for a task local logger.

Result

Using a task local for logging makes it incredibly ergonomic to carry metadata around. To be transparent, the benchmarks show that this is around 20x slower than a bare no-op log: a no-op log takes around 2 ns, whereas a task-local logger no-op log takes around 40 ns. 95% of that time is spent accessing the task local, since that path is missing some @inlinable in the runtime and goes through the C interface. Instruction-wise it is 20 vs 290 instructions.

I still propose that we go forward with this, since the ergonomic benefits far outweigh the initial performance cost. Furthermore, this acts as a great forcing function for performance improvements in the runtime since logging is often done in hot paths.

NoOpLogger
╒══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                   │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Instructions *           │      20 │      20 │      20 │      20 │      20 │      20 │      20 │     100 │
├──────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *         │       0 │       0 │       0 │       0 │       0 │       0 │       0 │     100 │
├──────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ns) * │       2 │       2 │       2 │       2 │       2 │       2 │       2 │     100 │
╘══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

NoOpLogger task local
╒══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                   │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Instructions *           │     289 │     289 │     289 │     289 │     289 │     290 │     290 │     100 │
├──────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *         │       0 │       0 │       0 │       0 │       0 │       0 │       0 │     100 │
├──────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ns) * │      33 │      39 │      39 │      40 │      40 │      42 │      42 │     100 │
╘══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛
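
For a sense of the ergonomics, here is roughly what usage of the proposed API looks like. This is only a sketch based on the diff in this PR (it uses the Logger.logger name from the diff; handleRequest, doWork, and the request-id key are made up for illustration):

import Logging

func handleRequest(id: String) async throws {
    // Take a copy of the current task-local logger, attach request-scoped
    // metadata, and bind the copy for everything inside the closure,
    // including child tasks.
    var logger = Logger.logger
    logger[metadataKey: "request-id"] = "\(id)"

    try await Logger.$logger.withValue(logger) {
        Logger.logger.info("handling request")  // carries "request-id"
        try await doWork()                      // no logger parameter needed
    }
}

func doWork() async throws {
    Logger.logger.debug("doing work")           // inherits "request-id" from the caller's task
}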

@FranzBusch force-pushed the fb-task-local-logger branch 2 times, most recently from 13be449 to 0e33370 on August 8, 2024
@ktoso (Member) left a comment

I’m currently offline for a week and would really want to have the time to dive deep on this. I don’t think it’s an obvious great idea and needs some proper debate and plans before merging. I’m especially concerned what this means for the library ecosystem and will have to think this through more.

Requesting changes to mark this intent. Thanks for your patience.

@@ -43,6 +43,9 @@ import WASILibc
 /// logger.info("Hello World!")
 /// ```
 public struct Logger {
+    @available(macOS 10.15, *)
+    @TaskLocal
+    public static var logger: Logger = Logger(label: "NoOp", SwiftLogNoOpLogHandler())
Contributor

I'd suggest a name like current or inherited so that it reads more naturally than Logger.logger (which sounds like a factory method that creates a brand new logger instead)


I concur with this completely.

Member

Agree! Logger.current or Logger.inherited both sound good to me but open to other stuff too

@weissi (Member) left a comment

I think that's great, but I know @ktoso would like to discuss. So I'm approving, but only because I know that @ktoso has already requested changes for the discussion.


@0xTim (Contributor) left a comment

Overall LGTM. I have concerns about it being enforced and not opt-in (though I guess one could pin to the old version). It's good to force performance improvements in the runtime, but with such a performance hit being shown I wonder if it's worth offering a way of not using it - ignore me, it's entirely opt-in if you just create an instance of the logger and pass that around 🤦 don't review code at 3AM 😆


var logger = Logger.logger
logger[metadataKey: "MetadataKey1"] = "Value1"
Logger.$logger.withValue(logger) {
Contributor

Is this syntax required for passing metadata through to child tasks? I thought that was automatic. E.g.:

logger[metadataKey: "MetadataKey1"] = "Value1"
try await someFunction()


func someFunction() async throws {
  // `MetadataKey1` is available here
}

Definitely agree with renaming logger here - the API doesn't come across great with the double logger in this example

Contributor

Logger is a value type, so the modified copy needs to be written back into the task local to be propagated to child tasks.
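
A small sketch of that difference, reusing someFunction from the question above and the API names from the diff (this is illustrative, not part of the PR):

import Logging

func someFunction() async {
    // Reads whatever logger is currently bound in the task.
    Logger.logger.info("inside someFunction")
}

func caller() async {
    var logger = Logger.logger
    logger[metadataKey: "MetadataKey1"] = "Value1"

    // Without re-binding, callees still read the unmodified task-local value:
    await someFunction()                 // no MetadataKey1 attached

    // Re-binding the task local propagates the modified copy to callees and child tasks:
    await Logger.$logger.withValue(logger) {
        await someFunction()             // MetadataKey1 is attached
    }
}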

@FranzBusch force-pushed the fb-task-local-logger branch from 0e33370 to 565e5fc on August 15, 2024
@ktoso added the kind/proposal (Proposals for review) label on Aug 20, 2024
Logger.logger.info("First log")

var logger = Logger.logger
logger[metadataKey: "MetadataKey1"] = "Value1"
Contributor

Should Logger.label also become publicly settable? If we have a single Logger in the task local, and there's no way to create a copy of the Logger (with the same log handler, level, and metadata, but different label), is the label even meaningful at that point, if ~all log events from the same task hierarchy will have the same label?

@weissi (Member), Mar 12, 2025

label was never really meaningful. It's only there because it was easy to add, and it enables one feature: if you have one process that runs two (or more) services, you can use the label to distinguish them. An example problem we thought of: let's say you use gRPC in both your services (call them "main service" and "admin endpoint") and you get a log from gRPC; you probably care more about it if it comes from the main service (and much less if it comes from the admin endpoint).

For example, you could query label=com.me.my-service source=GRPC level>=WARNING and you wouldn't get admin endpoint GRPC warnings.

That said, you can of course do this with metadata too:

var loggerBase = Logger(label: "everything")

var adminEndpointLogger = loggerBase
adminEndpointLogger[metadataKey: "service"] = "admin"

var mainServiceLogger = loggerBase
mainServiceLogger[metadataKey: "service"] = "main"

and then query service=main source=GRPC level>=WARNING.

@weissi (Member), Mar 12, 2025

So it's a non-overridable "base label" or whatever. Conveniently, it also allowed a transition from Logger-per-class setups that older logging frameworks (and some other ecosystems) use...

I'd consider label a legacy feature.

Contributor

That's fair, but it's still part of the non-deprecated API. This proposal creates/worsens the problem, so I think it should address it, or at the very least acknowledge it and recommend a workaround.

Member

@czechboy0 Oh I see, are you saying that with this proposal it's impossible for the user to wholesale switch from one Logger to another? If so then I think this needs addressing indeed.

More important than the label, we also have the logging backend, and I do sometimes hand down a logger with a different backend.

Member

It's exactly like it is today, and yes, it is helpful: your log aggregation system likely aggregates stuff from all of your services, so the label is essentially your service name.

And as I said before, the pod name isn't a good substitute for that because you may run two mostly separate services in one process.

So essentially label == service name, regardless of whether the service has its own pod or shares a process with a few others. I legitimately use it every day.

Contributor

the label is essentially your service name

Yeah I could get behind that. Many libraries don't do that, and create loggers and put their subcomponent names there, but maybe with the task-local logger, we can start moving in the direction where the label is not something libraries even think about (only apps).

The log events already get fileID, line number, and function name - that is hopefully enough, with user-provided metadata, to track where a log event is coming from. I wonder if we should also propose a #module macro and include that in log events, possibly replacing what some people use labels for today.

@weissi (Member), Mar 13, 2025

Yeah I could get behind that. Many libraries don't do that, and create loggers and put their subcomponent names there, but maybe with the task-local logger, we can start moving in the direction where the label is not something libraries even think about (only apps).

  1. Yes. And part of swift-log's design was squaring the circle such that it's not incompatible with existing logging modes (which means that label is used in various different ways -- but it was primarily meant for what we're discussing here: The service name/some root identifier).
  2. You phrased my thinking really well: The task-local logger should hopefully be enough of a carrot that we can all reap the benefits of structured logging without necessarily having to pass down a logger manually which feels alien to some devs and breaks APIs in some cases. Hence: win/win IMHO :)

The log events already get fileID, line number, and function name

Yes, and also, crucially, source, which is your module name. It is currently derived from #fileID, which is guaranteed to start with ModuleName/. But of course, it should just be source: String = #module.

And yes please to #module, I had pitched it before over here: https://forums.swift.org/t/pitch-introduce-module-to-get-the-current-module-name/45806
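
For reference, the kind of derivation described here looks roughly like this; moduleName is a made-up helper for illustration, not swift-log's actual implementation:

// #fileID has the form "ModuleName/FileName.swift", so the module name is
// everything before the first "/".
func moduleName(fileID: String = #fileID) -> String {
    if let slash = fileID.firstIndex(of: "/") {
        return String(fileID[..<slash])
    }
    return fileID
}

print(moduleName()) // e.g. "MyApp"

// A future #module macro would let APIs write `source: String = #module`
// instead of deriving the module from #fileID at every call site.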

Contributor

Yes, and also crucially source which is your module name.

Completely missed that. So yeah I agree we can just update docs on label and steer people in this direction, once the task-local logger lands.

@weissi (Member), Mar 13, 2025

Yes, and also crucially source which is your module name.
Completely missed that. So yeah I agree we can just update docs on label and steer people in this direction, once the task-local logger lands.

Sounds like a plan to me. And yes, the docs have always been spotty, which we should fix.

@slashmo (Contributor) commented on Mar 17, 2025

Using task locals for loggers enables easier passing of metadata for log messages and removes the cumbersome passing around of loggers.

What does a task-local logger enable that's not yet possible using metadata providers? For example, a metadata provider can already be used to look up a task-local and turn it into metadata transparently.
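
For context, the pattern referred to here looks roughly like this; RequestContext and the request-id key are hypothetical names used only for illustration:

import Logging

enum RequestContext {
    @TaskLocal static var requestID: String?
}

// A metadata provider that transparently turns the task local into log metadata.
let logger = Logger(label: "com.example.service", metadataProvider: .init {
    guard let id = RequestContext.requestID else { return [:] }
    return ["request-id": .string(id)]
})

RequestContext.$requestID.withValue("abc-123") {
    logger.info("handling request") // "request-id" is attached by the provider
}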

Removes the cumbersome passing around of loggers.

What other use cases are there for passing around loggers? I believe I saw it being used to pass around the minimum log level.

@FranzBusch (Member, Author)

What does a task-local logger enable that's not yet possible using metadata providers? For example, a metadata provider can already be used to look up a task-local and turn it into metadata transparently.

From experience with metadata providers, I now think they are just too unergonomic in both discoverability and usability. As an application author, to correctly set up metadata providers you need to know which library (often a transitive dependency) defines a metadata provider and then remember to set it up correctly.

This IMO goes against the goal we have for observability in Swift, which is "works out of the box". If I recall correctly, a big reason for introducing metadata providers was that we wanted our tracing backends to add logging metadata, but since they don't have the logger available we had to invert control and let the user stitch the two pieces together.

Nowadays, with package traits and the potential task-local logger here, I think we can provide a much better experience where tracing backends do this automatically while offering a way to disable it when needed. This brings us back to the "works out of the box" experience that we aim for.

What other use cases are there for passing around loggers? I believe I saw it being used to pass around the minimum log level.

Log levels really should only be set by the application that creates the logger. The big problem that I am trying to solve here is that the structured call tree works great, but our loggers are not automatically included in it. So what I see happening all over the place is that either the logger has almost no metadata, or the wrong logger is used because it was passed in through an init. I personally strongly believe that we should offer a task-local logger, since having the right metadata attached is so essential for our debugging capabilities.

@weissi (Member) commented on Mar 18, 2025

What other use cases are there for passing around loggers?
Log levels really should only be set by the application that creates the logger.

Disagree. I very frequently want to lower (e.g. to debug) the level of a particular tree when I know the problem is happening in a certain corner. Occasionally I also raise the level (e.g. to warning) when a library keeps spewing out unimportant stuff in info or notice.

The other big use case is to provide a Logger with an alternative backend for sub operations. One thing I've done successfully is this:

func someFunctionality(logger realLogger: Logger) async throws {
    [...]
    let inMemoryLogAccumulator = InMemoryLogAccumulator()
    var inMemoryLogger = Logger(label: "foo", factory: { _ in inMemoryLogAccumulator })
    inMemoryLogger.logLevel = .trace
    do {
        try await codeThatHasAnnoyingBugs(logger: inMemoryLogger)
    } catch let error as SomeSpecificWeirdError {
        // A reproduction of this one weird error that I can't reproduce outside of prod
        // happened, so let's emit all the accumulated logs into our real logging system.
        var realLogger = realLogger
        realLogger.logLevel = .trace
        realLogger[metadataKey: "weird-error-reproduction"] = "true"
        inMemoryLogAccumulator.emitAllLogs(into: realLogger)
        throw error
    }
    [...]
}

That way I get all the .trace logs from codeThatHasAnnoyingBugs() but only if we hit SomeSpecificWeirdError. In all other cases, it's silenced. This is really powerful, exactly like tail-sampling, implemented super easily.

And I get to select it easily with weird-error-reproduction = "true" GROUP BY request-id or similar (assuming that something else emitted request-id into the metadata).
