-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rumqttd: Changes remote start span from error to info. #782
base: main
Are you sure you want to change the base?
rumqttd: Changes remote start span from error to info. #782
Conversation
it was set to see #510 |
This also would bring in the same span type as websocket remote. It uses info, this uses error. |
can you elaborate more? didn't get it 😅 |
19a84e3
to
9666907
Compare
Sure. If you look at the Rust ecosystem logging, after the dependency chain is done, you'll usually end with
Log is nice because each of those can have their own impl and format as needed. Log is also nice because you get all the other logs from applications that use it. However, this can also be a downside when people use logging differently. The case of rumqttd and us, it is great that we don't have to fork the broker and put in specialty logs because rumqttd uses log, but because rumqttd uses that error span, that shows up as error type on our end, because we log error types as error types. This causes a problem in the network logger because alarms and dashboards are built around it. If for example, 100 clients connect at the same time, this produces 100 "error" logs of them connecting because of that error span. In alarming, if we have an alarm that is triggered when x error types reach a certain trreshold, this triggers that alarm, someone has to investigate, etc... |
wdym by error type ? btw, rumqtt/rumqttd/src/router/routing.rs Line 236 in 8c04339
routing.rs so why only this error span trigger it?
maybe some issue with note: rumqttd uses tracing instead of log 😄 |
The log type comes in as "error", not info, debug, trace, etc...
Maybe it is just when instrument is called with it, this is the only spot it happens.
Yes, but if you run the dependency chain all the way down, you are using log :): https://github.com/tokio-rs/tracing/blob/master/tracing/Cargo.toml#L35 |
Might be issue with instrument then? 👀 We should investigate it further to figure out actual cause! |
Not particularly a problem with instrumented. The problem is saying that events emitted inside this span are to be treated as errors, which I do not think is ultimately what is wanted. When a thing is instrumented, anytime a poll or drop occurs, the span is entered: impl<T: Future> Future for Instrumented<T> {
type Output = T::Output;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let (span, inner) = self.project().span_and_inner_pin_mut();
let _enter = span.enter();
inner.poll(cx)
}
} In our case, instrumenting begins, new connection happens, thing is polled, event is emitted, that event is an error event because the span it is in is an error span. In regards to #510, it was changed from info to error not because the level needed changed for events that occurred, but because changing it to error was a way to get auto information logged. Keeping the span as info, and adding in the pieces of information to the log where needed is the way to go imo. After checking, this does effect all other logs. I just didn't notice immediately because I forgot I muted all of routing because of this reason as weill: journal::init(
env!("CARGO_BIN_NAME"),
env!("CARGO_PKG_VERSION"),
Some(vec!["rumqttd::router::routing".to_owned()]),
); (The last param is a vec of modules to blacklist from our logging). Currently it is blacklisted because everything reports as error type. |
oh, then it makes sense. ThenTtis means we should also change
adding those pieces, like client_id/connection_id in every log defeats the purpose of span, we should think of some other way to tackle it. I will see if there is something we can do at tracing_subscriber end 👀 |
This fixes initial connection span being treated as error instead of info.
Type of change
Miscellaneous (related to maintenance)
Checklist:
cargo fmt
CHANGELOG.md
if it's relevant to the users of the library. If it's not relevant mention why.