Write custom errors package with stack trace functionality #5239

Merged
merged 19 commits into from
May 24, 2022

Conversation

Contributor

@bisakhmondal bisakhmondal commented Mar 16, 2022

To test

package errors_test

import (
	"testing"

	"github.com/thanos-io/thanos/pkg/errors"
)

func TestErrors(t *testing.T) {
	err := errors.New("an error occurred")
	err = errors.Wrap(err, "wraper1")
	t.Logf("error: %+v", err)
}
/home/bisakh/go/go1.17.8/bin/go tool test2json -t /tmp/GoLand/___TestErrors_in_github_com_thanos_io_thanos_pkg_errors.test -test.v -test.paniconexit0 -test.run ^\QTestErrors\E$
=== RUN   TestErrors
    errors_test.go:11: error: wraper1
        > github.com/thanos-io/thanos/pkg/errors_test.TestErrors	/home/bisakh/Desktop/OSS/thanos/pkg/errors/errors_test.go:10
        > testing.tRunner	/home/bisakh/go/go1.17.8/src/testing/testing.go:1259
        > runtime.goexit	/home/bisakh/go/go1.17.8/src/runtime/asm_amd64.s:1581
        an error occurred
        > github.com/thanos-io/thanos/pkg/errors_test.TestErrors	/home/bisakh/Desktop/OSS/thanos/pkg/errors/errors_test.go:9
        > testing.tRunner	/home/bisakh/go/go1.17.8/src/testing/testing.go:1259
        > runtime.goexit	/home/bisakh/go/go1.17.8/src/runtime/asm_amd64.s:1581
--- PASS: TestErrors (0.00s)
PASS

closes #5176

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

TODO:

  • write tests
  • full migration of existing error packages

@bisakhmondal
Contributor Author

[WIP] How does the output look, @matej-g ?

Member

@saswatamcode saswatamcode left a comment

This looks good overall! Awesome work! 💫
Would love to see some more examples of this in action.

Just some suggestions.


// The idea of writing errors package in thanos is highly motivated from the Tast project of Chromium OS Authors. However, instead of
// copying the package, we end up writing our own simplified logic borrowing some ideas from the errors and github.com/pkg/errors.
// A big thanks to all of them.
Member

I wonder if we need to mention Chromium's license here too. I believe we do it for some Prometheus code that we use.

Contributor Author

Hi, thanks for the review. I am wondering about that too. To be honest, the current code is not an exact replica of the Chromium OS package: here and there I have removed some unnecessary code and interfaces, used strings.Builder for efficient formatting, and changed the return signature of a few functions.
Here's their license https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast/LICENSE

What do you think we should do here?

Member

I think it's fine if we just add a line pointing to this license here, as we don't copy but rather adapt code from Chromium. But I don't feel too strongly about this either way. 🙂
WDYT @matej-g?

@bisakhmondal bisakhmondal marked this pull request as ready for review March 17, 2022 10:01
Makefile Outdated
@@ -360,7 +360,8 @@ go-lint: check-git deps $(GOLANGCI_LINT) $(FAILLINT)
$(call require_clean_work_tree,'detected not clean work tree before running lint, previous job changed something?')
@echo ">> verifying modules being imported"
@# TODO(bwplotka): Add, Printf, DefaultRegisterer, NewGaugeFunc and MustRegister once exception are accepted. Add fmt.{Errorf}=github.com/pkg/errors.{Errorf} once https://github.com/fatih/faillint/issues/10 is addressed.
-@$(FAILLINT) -paths "errors=github.com/pkg/errors,\
+@$(FAILLINT) -paths "errors=github.com/thanos-io/thanos/pkg/errors,\
+github.com/pkg/errors=github.com/thanos-io/thanos/pkg/errors,\
Contributor Author

I think we should replace all the instances of /pkg/errors in a separate PR instead of unnecessarily bloating this one.

💥➜ thanos git:(pkg/errors) ✗ make lint 2>&1 | wc -l
144

Your views @matej-g?

Member

definitely separate PR yes.

@metalmatze
Contributor

Do I understand this implementation correctly, that it's mostly a shim for the stdlib errors packages that supports stack traces as part of errors now?

@bisakhmondal
Contributor Author

Do I understand this implementation correctly, that it's mostly a shim for the stdlib errors packages that supports stack traces as part of errors now?

Hi @metalmatze, yes, you are right. It extends the functionality of the stdlib errors package by combining

  • a stacktrace with error chaining capabilities
  • a Wrap and Wrapf functionality
  • an Errorf functionality (closest alternative fmt.Errorf)

In simple words, a very minimalistic, simple and readable replacement of github.com/pkg/errors package : )
Thank you!
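
To make the description above concrete, here is a minimal sketch of such a shim. It is not the code from this PR: the type name wrapped, the sentinel errNotFound, and the choice to return nil when wrapping a nil error are assumptions made for illustration, and stack capture is left out to keep it short.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel used only in this example.
var errNotFound = Newf("not found")

// wrapped is an illustrative error type: a message plus an optional cause.
type wrapped struct {
	msg   string
	cause error
}

// Error renders the chain as "outer: inner".
func (w *wrapped) Error() string {
	if w.cause == nil {
		return w.msg
	}
	return w.msg + ": " + w.cause.Error()
}

// Unwrap exposes the cause so the stdlib errors.Is and errors.As can walk the chain.
func (w *wrapped) Unwrap() error { return w.cause }

// Newf creates a new error from a format string.
func Newf(format string, args ...interface{}) error {
	return &wrapped{msg: fmt.Sprintf(format, args...)}
}

// Wrapf adds context to cause. Returning nil for a nil cause is an assumption of this sketch.
func Wrapf(cause error, format string, args ...interface{}) error {
	if cause == nil {
		return nil
	}
	return &wrapped{msg: fmt.Sprintf(format, args...), cause: cause}
}

func main() {
	err := Wrapf(errNotFound, "loading block %d", 7)
	fmt.Println(err)                         // loading block 7: not found
	fmt.Println(errors.Is(err, errNotFound)) // true
}
```

Because the type implements Unwrap, callers can keep using the stdlib errors.Is and errors.As on errors produced by the shim.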

@bwplotka
Member

I wonder if we should not start by adding this error lib in https://github.com/efficientgo/tools/tree/main/core to keep util things as ultra small module?

@bisakhmondal
Contributor Author

I wonder if we should not start by adding this error lib in https://github.com/efficientgo/tools/tree/main/core to keep util things as ultra small module?

Hi, thanks for the suggestion @bwplotka, and I am sorry for the late reply. I just took a look at the package, especially the merrors pkg. Here are a few concerns.

  • One question that keeps coming to my mind is what the utility of multierrors is. Go has evolved into a language where errors are handled right where they are generated. Errors are errors, no matter how small or big the impact might be at runtime. It's definitely a bad idea not to act where err1 has been generated and instead add err2 or others to it. Also, the program/function should return the first error it has encountered, not the aggregation of all errors (the implication being that if step1 fails, it is highly likely that step2 and later steps will fail too).
//  merr := merrors.New(err1)
//  merr.Add(err2, errOrNil3)
//  for _, err := range errs {
//    merr.Add(err)
//  }
//  return merr.Err()
  • As per the actual issue, if I am not wrong, we are looking for an alternative to github.com/pkg/errors. I am afraid this package doesn't suffice. There is no wrap or unwrap method implemented in multierrors. To use this package we would again have to fall back to errors.New or fmt.Errorf
    (with something like merrors.New(errors.New("the extra context"), err)).

    Even if we do so, the Is and As method will be inefficient in this case,
    https://github.com/efficientgo/tools/blob/fe763185946be83b20da626605319733bd7f97cb/core/pkg/merrors/errors.go#L116
    It will iterate through the extra context which we have to pass as an error(due to the design of the package).

  • Taking from the issue itself,

There is no stack trace functionality which we use heavily https://github.com/thanos-io/thanos/blob/main/cmd/thanos/main.go#L131

  • And finally, the methods exposed by the package would completely destroy backward compatibility. It's not just a simple change in the import path anymore (from pkg/errors to github.com/thanos-io/thanos/pkg/errors). We would have to change each and every instance of the package where err has been used. In the initial days after the migration, there would be some development toil for the community to adapt to something new.

Hey, I am new to the community and I don't have much context about the discussions and decisions : ) Here I just tried to do some research and share my findings on whether efficientgo/tools meets the requirements. I think it doesn't, for the aforementioned reasons. WDYT?

Thanks : )

@saswatamcode
Member

saswatamcode commented Mar 19, 2022

Hey @bisakhmondal , thanks for the detailed analysis! :)

To address some of your concerns,

One question that keeps coming to my mind is what the utility of multierrors is. Go has evolved into a language where errors are handled right where they are generated. Errors are errors, no matter how small or big the impact might be at runtime. It's definitely a bad idea not to act where err1 has been generated and instead add err2 or others to it. Also, the program/function should return the first error it has encountered, not the aggregation of all errors (the implication being that if step1 fails, it is highly likely that step2 and later steps will fail too).
// merr := merrors.New(err1)
// merr.Add(err2, errOrNil3)
// for _, err := range errs {
// merr.Add(err)
// }
// return merr.Err()

Hmm, I'd say there is definitely utility in having merrors package. :)
Even in Go, we might not always want to handle an error as soon as it occurs! Sometimes, we may just want to return all errors in one go, and this package is definitely useful for that.

For example, say in our docs CI, we have a tool running called mdox, which checks each markdown file and ensures that they are correctly formatted and have correct links. You run it once via make docs and it outputs all the errors that might exist in all the files.

If we were to just return the very first error encountered, users would have to run it again and again until they have fixed each error, which is definitely not optimal and would be a painful process.

There are use cases for this in Thanos too, for example, the tools rule-check command which checks if all the rule files passed are valid or not. See

func checkRulesFiles(logger log.Logger, patterns *[]string) error {

You'd also find such cases scattered throughout the codebase.

So there are definitely use cases for this! I think there are also other implementations of this like https://github.com/hashicorp/go-multierror.
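
As a rough illustration of the "report everything in one run" use case described above, here is a small hand-rolled sketch. It does not use the merrors API; the multiError type and the checkAll and hasYAMLSuffix helpers are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// multiError is an illustrative aggregate of several errors.
type multiError []error

// Error joins all collected messages into a single line.
func (m multiError) Error() string {
	msgs := make([]string, len(m))
	for i, err := range m {
		msgs[i] = err.Error()
	}
	return fmt.Sprintf("%d errors: %s", len(m), strings.Join(msgs, "; "))
}

// hasYAMLSuffix is a stand-in check that rejects names without a .yaml suffix.
func hasYAMLSuffix(name string) error {
	if !strings.HasSuffix(name, ".yaml") {
		return fmt.Errorf("not a .yaml file")
	}
	return nil
}

// checkAll runs check on every name and keeps going after a failure,
// so the caller sees all problems in one pass.
func checkAll(names []string, check func(string) error) error {
	var errs multiError
	for _, name := range names {
		if err := check(name); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", name, err))
		}
	}
	if len(errs) == 0 {
		// Return an untyped nil so callers can compare against nil safely.
		return nil
	}
	return errs
}

func main() {
	err := checkAll([]string{"a.yaml", "b.txt", "c.json"}, hasYAMLSuffix)
	fmt.Println(err) // 2 errors: b.txt: not a .yaml file; c.json: not a .yaml file
}
```

Returning on the first failure would only report b.txt here; the aggregate surfaces both invalid files at once.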

But a question I have for your point is, why does the existence of merrors matter here? This PR is just a replacement for the errors package that we currently use, not related to the way we handle errors. So the suggestion is to add it to our utils repo efficientgo/tools and then use it from there. We don't really need to change existing merrors, but add a new errors package, right?

As per the actual issue, if I am not wrong, we are looking for an alternative to github.com/pkg/errors. I am afraid this package doesn't suffice. There is no wrap or unwrap method implemented in multierrors. To use this package we would again have to fall back to errors.New or fmt.Errorf
(with something like merrors.New(errors.New("the extra context"), err)).
Even if we do so, the Is and As method will be inefficient in this case,
https://github.com/efficientgo/tools/blob/fe763185946be83b20da626605319733bd7f97cb/core/pkg/merrors/errors.go#L116
It will iterate through the extra context which we have to pass as an error(due to the design of the package).

Not sure what you mean here. I think @bwplotka's point was to add this as a separate "errors" package in the tools repo. Not as an extension or modification to the existing merrors one.

And finally, the methods exposed by the package would completely destroy backward compatibility. It's not just a simple change in the import path anymore (from pkg/errors to github.com/thanos-io/thanos/pkg/errors). We would have to change each and every instance of the package where err has been used. In the initial days after the migration, there would be some development toil for the community to adapt to something new.

Hmm. Why would it destroy backward compatibility? We aren't changing/removing any of the existing functionality, right?

If this package is added to efficientgo/tools repo, iiuc, it would still be a change in import path from pkg/errors to github.com/efficientgo/tools/pkg/errors + a version bump in go.mod. There wouldn't be any difference in dev toil when compared with adding a custom errors package that this PR is proposing.

I might be understanding something differently about your points here, so correct me if I'm wrong :).

@bisakhmondal
Contributor Author

But a question I have for your point is, why does the existence of merrors matter here? This PR is just a replacement for the errors package that we currently use, not related to the way we handle errors. So the suggestion is to add it to our utils repo efficientgo/tools and then use it from there. We don't really need to change existing merrors, but add a new errors package, right?

Ahh, Thanks for the clarification @saswatamcode. I misinterpreted @bwplotka's comment thinking we are going to use the error package (merrors) that is currently present in the tools package.

Definitely, there are certain use cases where combining multiple errors makes sense. As I said, I misinterpreted that we are going to make this a default throughout the thanos project which actually doesn't make sense.

As per the actual issue, if I am not wrong, we are looking for an alternative to github.com/pkg/errors. I am afraid this package doesn't suffice. There is no wrap or unwrap method implemented in multierrors. To use this package we would again have to fall back to errors.New or fmt.Errorf
(with something like merrors.New(errors.New("the extra context"), err)).
Even if we do so, the Is and As method will be inefficient in this case,
https://github.com/efficientgo/tools/blob/fe763185946be83b20da626605319733bd7f97cb/core/pkg/merrors/errors.go#L116
It will iterate through the extra context which we have to pass as an error(due to the design of the package).

Not sure what you mean here. I think @bwplotka's point was to add this as a separate "errors" package in the tools repo. Not as an extension or modification to the existing merrors one.

Yep, I got it now : ) I am totally fine with keeping the error package as a part of thanos project or inside efficientgo/tools. I have no strong opinion here : )

@bwplotka
Member

Not sure what you mean here. I think @bwplotka's point was to add this as a separate "errors" package in the tools repo. Not as an extension or modification to the existing merrors one.

Correct. I just mentioned utility. (:

@bwplotka
Member

But we can start small with just implementing it here for now

@bisakhmondal
Contributor Author

But we can start small with just implementing it here for now

Great! Would you mind giving it a quick look and sharing some reviews? I'll write the test suite then : )
Thank you!

Member

@bwplotka bwplotka left a comment

Great job. I have a few proposals - feel free to challenge nonsensical ideas that people are merely used to from before. Especially Wrap vs Wrapf is something we could improve 🤔

Let's make sure it's clear and ready.

We also need some at least basic tests. Thanks!


// Errorf creates a new error with the given message and a stacktrace in details.
// An alternative to fmt.Errorf function.
func Errorf(format string, args ...interface{}) error {
Member

Same here - again - it's our package, we can innovate.

What about only

errors.Newf and errors.Wrapf?

Or even naming it without f but supporting fmt? No harm here, but a simpler API for everyone. So:

errors.New(format string, args ...interface{}) and errors.Wrap(err, format string, args ...interface{}) would work too

Contributor Author

Yeah, as it's going to be our custom error package, having simpler APIs would be very useful in the long run.
I was just a bit worried about the performance penalty of the extra iteration over the format string (though it's linear, O(N) (ref)). So I ran a quick benchmark, and the results are equivalent in terms of memory and time.

func benchWrap(errorMsg *string, b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = errors.Wrap(errors.New("random error"), *errorMsg)
	}
}

func benchWrapf(errorMsg *string, b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = errors.Wrapf(errors.New("random error"), *errorMsg)
	}
}

// >>> len("something terrible has happened.")
// 32

var (
	len32  = "something terrible has happened."
	len64  = "something terrible has happened.something terrible has happened."
	len128 = "something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened."
	len512 = "something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened.something terrible has happened."
)

func BenchmarkWrap_32(b *testing.B)  { benchWrap(&len32, b) }
func BenchmarkWrapf_32(b *testing.B) { benchWrapf(&len32, b) }

func BenchmarkWrap_64(b *testing.B) { benchWrap(&len64, b) }
func BenchmarkWrapf_64(b *testing.B) {
	benchWrapf(&len64, b)
}

func BenchmarkWrap_512(b *testing.B) {
	benchWrap(&len512, b)
}
func BenchmarkWrapf_512(b *testing.B) {
	benchWrapf(&len512, b)
}

//🔥➜ errors git:(pkg/errors) ✗ go test -bench=. -benchmem
//goos: linux
//goarch: amd64
//pkg: github.com/thanos-io/thanos/pkg/errors
//cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
//BenchmarkWrap_32-8        909074              1301 ns/op             256 B/op          2 allocs/op
//BenchmarkWrapf_32-8       942794              1297 ns/op             256 B/op          2 allocs/op
//BenchmarkWrap_64-8        936003              1289 ns/op             256 B/op          2 allocs/op
//BenchmarkWrapf_64-8       941866              1288 ns/op             256 B/op          2 allocs/op
//BenchmarkWrap_512-8       964816              1286 ns/op             256 B/op          2 allocs/op
//BenchmarkWrapf_512-8      883635              1285 ns/op             256 B/op          2 allocs/op

As you have suggested, it's no harm to trim the extra f from this package : )
Updating

Contributor Author

Fun fact:
actually, it's faster and more memory-efficient than github.com/pkg/errors ^_^

goos: linux
goarch: amd64
pkg: github.com/thanos-io/thanos/pkg/errors
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkWrap_32-8                        776246              1312 ns/op             256 B/op          2 allocs/op
BenchmarkWrap_pkg_errors_32-8             805689              1498 ns/op             640 B/op          7 allocs/op
BenchmarkWrap_64-8                        767688              1324 ns/op             256 B/op          2 allocs/op
BenchmarkWrap_pkg_errors_64-8             831584              1523 ns/op             640 B/op          7 allocs/op
BenchmarkWrap_512-8                       952338              1342 ns/op             256 B/op          2 allocs/op
BenchmarkWrap_pkg_errors_512-8            813520              1508 ns/op             640 B/op          7 allocs/op

Member

Nice.

After some thinking, let's keep the f suffix so we stay consistent with lints/tooling that try to verify format strings. Just removing the non-f variants and allowing a format string without arguments is fine. WDYT?

Member

Also next time, just in case use

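func benchWrapf(errorMsg *string, b *testing.B) {
	b.ReportAllocs() // report allocations even without -benchmem
	b.ResetTimer()   // exclude any setup above from the measurement
	for i := 0; i < b.N; i++ {
		_ = errors.Wrapf(errors.New("random error"), *errorMsg)
	}
}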

Contributor Author

Great! Thanks

Comment on lines +116 to +128
// Is is a wrapper of built-in errors.Is. It reports whether any error in err's
// chain matches target. The chain consists of err itself followed by the sequence
// of errors obtained by repeatedly calling Unwrap.
func Is(err, target error) bool {
	return errors.Is(err, target)
}
Member

wonder if alias var Is = errors.Is would not work? (and same for As).

Contributor Author

It would work, updating

Contributor Author

Hey, sorry, I have reverted back to the old one.
Reason: using var Is = errors.Is would definitely work, but users would get a slightly worse IDE experience, because the IDE treats those functions as variables (which they actually are in the modified pkg). So auto-completion behaves a bit oddly.
(Screenshot of the IDE auto-completion omitted.)

I dug in a little bit to find an alternative approach to tackle this, but it seems popular packages do use the wrapping while exposing their client-side APIs from the internal packages.
ref: https://github.com/temporalio/sdk-go/blob/fd0d1eb548eb0621a5395581cfe2c418704b007c/client/client.go#L435-L476

I hope you are okay with it. @bwplotka?

Member

Makes sense, but worth adding comment why we did this this way 🙃

Contributor Author

Sure : )

Member

BTW I know why it did not work for you. We can do type Is = errors.Is I think. Let's merge and iterate (:

Contributor Author

@bisakhmondal bisakhmondal Apr 23, 2022

Hi sure 👍 thanks for the suggestion, we can tackle it on the next pr. On second thought I doubt if Go will allow putting a new type over a function definition (for function signature it can be done).
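
For reference, a small sketch of the two re-export styles discussed in this thread. The names IsVar, errSentinel, and wrap are made up for the example. Both forms compile and behave the same at the call site; a type declaration such as type Is = errors.Is would not compile, since errors.Is is a function value and not a type.

```go
package main

import (
	"errors"
	"fmt"
)

// errSentinel is a hypothetical sentinel error for the demonstration.
var errSentinel = errors.New("sentinel")

// IsVar re-exports the stdlib function as a package-level variable.
// Tooling sees it as a variable of function type.
var IsVar = errors.Is

// Is re-exports the stdlib function through a thin wrapper,
// so tooling sees a regular function with its own doc comment.
func Is(err, target error) bool {
	return errors.Is(err, target)
}

// wrap adds context with the stdlib %w verb.
func wrap(err error) error {
	return fmt.Errorf("context: %w", err)
}

func main() {
	err := wrap(errSentinel)
	fmt.Println(IsVar(err, errSentinel), Is(err, errSentinel)) // true true
}
```

The wrapper costs one extra (usually inlinable) call and cannot be reassigned by other packages, whereas an exported variable can be.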

func newStackTrace() stacktrace {
	const stackDepth = 16 // record maximum 16 frames (if available)

	pc := make([]uintptr, stackDepth)
@sthaha sthaha Apr 1, 2022

This implementation has the same issue as pkg/errors, i.e. pc escapes to the heap, so all 16 uintptr entries are kept in memory even if the stack is only 1 call frame deep. I have explained the issue in detail here, based on a real-world scenario.

Please try go build -gcflags='-m -m' ./pkg/errors/stacktrace.go and you would find

pkg/errors/stacktrace.go:21:12: make([]uintptr, stackDepth) escapes to heap:
pkg/errors/stacktrace.go:21:12:   flow: pc = &{storage for make([]uintptr, stackDepth)}:
pkg/errors/stacktrace.go:21:12:     from make([]uintptr, stackDepth) (spill) at pkg/errors/stacktrace.go:21:12
pkg/errors/stacktrace.go:21:12:     from pc := make([]uintptr, stackDepth) (assign) at 

The solution is to allocate the buffer pc on the stack, like the original implementation, and then copy what's needed off it to prevent it from escaping to the heap.

Suggested change
- pc := make([]uintptr, stackDepth)
+ const maxDepth = 16
+ var pcs [maxDepth]uintptr // allocate on stack
+ n := runtime.Callers(3, pcs[:])
+ st := make(stacktrace, n)
+ copy(st, pcs[:n])
+ return st

Another thought is if we really need to hold on to the entire CallFrame until String() is called? At the expense of some compute, we could calculate Stacktrace string immediately.

Contributor Author

Nice nitpick and great catch @sthaha.

pc escapes to the heap, so all 16 uintptr entries are kept in memory even if the stack is only 1 call frame deep.

Yes, you are right - there is no point in holding an extra chunk of unused heap memory. And I think you are copying to another slice because, even though we are returning stacktrace(pc[:n]), the capacity of pc still remains 16 (allocated memory), right?

But this additional copy adds some extra complexity : ) (though it's minor and negligible, as we are dealing with a length of 16). According to the compiler's inlining report, the function cost increases from 88 to 96:

./stacktrace.go:20:6: cannot inline newStackTrace: function too complex: cost 96 exceeds budget 80
.
.
./stacktrace.go:20:6: cannot inline newStackTrace: function too complex: cost 88 exceeds budget 80

Since go1.2 we have 3 index slice capability where the third param can define the capacity of the newly created slice.

So just updating the return statement to return stacktrace(pc[:n:n]) would yield the same effect of the suggestion you proposed.

Thanks a lot. That was really awesome.
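
One caveat worth checking, shown as a small sketch with made-up helper names (capped, copied): a full slice expression such as pc[:n:n] limits the capacity of the returned slice, but the slice still points into the original 16-element array, so that array stays reachable for as long as the slice does. Copying into a fresh slice of length n is what actually detaches the result from the larger buffer.

```go
package main

import "fmt"

const maxDepth = 16

// capped returns the first n elements using a full slice expression.
// The result shares its backing array with buf.
func capped(buf []uintptr, n int) []uintptr {
	return buf[:n:n]
}

// copied returns the first n elements in a freshly allocated slice.
// The result does not reference buf at all.
func copied(buf []uintptr, n int) []uintptr {
	out := make([]uintptr, n)
	copy(out, buf[:n])
	return out
}

func main() {
	buf := make([]uintptr, maxDepth)
	a := capped(buf, 2)
	b := copied(buf, 2)

	// A later write to buf is visible through a (shared array) but not through b.
	buf[0] = 42

	fmt.Println(len(a), cap(a), a[0]) // 2 2 42
	fmt.Println(len(b), cap(b), b[0]) // 2 2 0
}
```

So the two approaches agree on len and cap, but only the copy lets the original buffer be released (or stay on the stack) independently of the stored trace.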

Another thought is if we really need to hold on to the entire CallFrame until String() is called? At the expense of some compute, we could calculate Stacktrace string immediately.

I think we shouldn't do it for the following reasons

  • Its compute expensive - runtime needs to retrieve caller information
  • More memory usage - the String method is meant for human consumption so naturally, this adds a lot of extra information, text etc. compared to the 16-element slice of type uintptr. During error chaining it would be worse.
  • Not every error needs the stacktrace. For specific format verbs like %+v stacktrace gets dumped recursively (only then String gets called).
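
The lazy approach argued for above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the names tracedError, newTraced, and newStacktrace are hypothetical, and the skip count of 3 assumes exactly this call depth. Only raw program counters are stored at creation time; they are resolved to function, file, and line when the error is printed with %+v.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// stacktrace holds raw program counters; resolving them is deferred to String.
type stacktrace []uintptr

// newStacktrace records up to 16 frames into a stack buffer and copies
// only the used part into the returned slice.
func newStacktrace() stacktrace {
	const maxDepth = 16
	var pcs [maxDepth]uintptr
	// Skip runtime.Callers, newStacktrace, and the error constructor.
	n := runtime.Callers(3, pcs[:])
	st := make(stacktrace, n)
	copy(st, pcs[:n])
	return st
}

// String resolves the program counters into one line per frame.
func (s stacktrace) String() string {
	if len(s) == 0 {
		return ""
	}
	var b strings.Builder
	frames := runtime.CallersFrames(s)
	for {
		frame, more := frames.Next()
		fmt.Fprintf(&b, "> %s\t%s:%d\n", frame.Function, frame.File, frame.Line)
		if !more {
			break
		}
	}
	return b.String()
}

// tracedError is an illustrative error carrying a message and a stack.
type tracedError struct {
	msg   string
	stack stacktrace
}

func newTraced(msg string) error {
	return &tracedError{msg: msg, stack: newStacktrace()}
}

func (e *tracedError) Error() string { return e.msg }

// Format prints the stack only for %+v; every other verb prints just the message.
func (e *tracedError) Format(s fmt.State, verb rune) {
	if verb == 'v' && s.Flag('+') {
		fmt.Fprintf(s, "%s\n%s", e.msg, e.stack.String())
		return
	}
	fmt.Fprint(s, e.msg)
}

func main() {
	err := newTraced("boom")
	fmt.Printf("%v\n", err)  // boom
	fmt.Printf("%+v\n", err) // message followed by one "> func\tfile:line" line per frame
}
```

With this shape, an error that is never printed with %+v never pays for frame resolution or string building.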

@sthaha sthaha Apr 1, 2022

Since go1.2 we have 3 index slice capability where the third param can define the capacity of the newly created slice.

💯 .. Thank you, I wasn't aware :)

I think we shouldn't do it for the following reasons ...

Its compute expensive - runtime needs to retrieve caller information

Have you tried a benchmark?

More memory usage - the String method is meant for human consumption so naturally, this adds a lot of extra information, text etc. compared to the 16-element slice of type uintptr. During error chaining it would be worse.

What I meant was to only hold on to information that you need than have uintptr which now points to the callframe and IIUC, holding onto the callframe will now prevent GC from cleaning the callframe info until the error is garbage collected as well.

It may be worth seeing the inuse allocation if we don't hold on to the callframes.

Contributor Author

@bisakhmondal bisakhmondal Apr 1, 2022

I think we shouldn't do it for the following reasons ...

Its compute expensive - runtime needs to retrieve caller information

Have you tried a benchmark?

IIUC, in the new approach that you are suggesting we should call the String immediately at the expense of an extra call which might not be required if the error is not logged with an "%+v" format verb. So we are adhering to extra computation which might not be of any use. So obviously it will be expensive in time against the current approach as the current one only calls string when it's required.

More memory usage - the String method is meant for human consumption so naturally, this adds a lot of extra information, text etc. compared to the 16-element slice of type uintptr. During error chaining it would be worse.

What I meant was to only hold on to information that you need than have uintptr which now points to the callframe and IIUC, holding onto the callframe will now prevent GC from cleaning the callframe info until the error is garbage collected as well.

Now coming to the memory footprint, IMHO storing unstructured data (the stack trace output string) is generally a bad idea. If we change the output string format from fmt.Sprintf("> %s\t%s:%d\n", frame.Func.Name(), frame.File, frame.Line) this to something else memory footprint gets changed. So I think this optimization won't be valid in longer run.

After giving it a thought, I think it's a tradeoff, why?
Memory usage:

  • current approach: 16*8 bytes of uintptr (on a 64-bit platform) + unreleased memory from gc (not sure what it stores at that PC address, might be some function-related metadata to populate the callFrames. definitely releases all the resources used inside the function)
  • proposed approach: large unstructured string + still some unreleased memory (not sure when will the next cycle of go GC will kick off and it's dependent on the config).

I am not sure if we can benchmark this anyhow or not.

the error is garbage collected as well.

If you look, in the Go world err has a comparatively shorter lifespan than anything else. In a well-written program, errors get handled immediately, either by being returned to the caller or logged into the sink. So I think we are good here. After all, it's an internal package, and if we find some untoward runtime behaviour we can always optimize differently.

Thank you : )

Member

If you see, in the Go world errors have a comparatively shorter lifespan than almost anything else.

Not always. Check our fixed issue: #5257

Anyway, I asked @sthaha to remind us about the optimization emperor did, to make sure we are aware of it. As usual, we can iterate on it; we won't get it perfect on the first try. The main thing is to get the APIs as good as possible, since they are hard to change later.

@bisakhmondal
Contributor Author

It looks like the Lift and DCO CIs are pending forever. Does this take this long, especially the code analysis by lift? Is there a way we can optimize it? I'd love to.

Hi guys, could you please take another look at this PR when you have some time : )

cc @bwplotka

bwplotka previously approved these changes Apr 2, 2022
Member

@bwplotka bwplotka left a comment

This is an amazing job. LGTM. Before merging, my only question is really about Newf and Wrapf (A) vs New and Wrap (B).

I like that the no-f version (B) is cleaner and shows that we never support Wrap/New without Sprintf-like formatting. On the other hand, people are used to the f suffix, plus there might be nice tooling that works only if the method has the f suffix (I don't have an example, but I have seen linters for Printf formatting errors).

I think I am leaning towards (A) actually... Thoughts?

// Copyright (c) The Thanos Authors.
// Licensed under the Apache License 2.0.

//nolint
Member

Again, why nolint?

Contributor Author

Tast project

The spellchecker thinks the project name should be Taste instead of the keyword Tast 🙃

@bwplotka
Member

bwplotka commented Apr 2, 2022

Let's try to push (or re-push) another commit to get DCO unstuck. You can ignore Lift; it's not required and rarely works 🙃

@bisakhmondal
Contributor Author

This is an amazing job. LGTM. Before merging, my only question is really about Newf and Wrapf (A) vs New and Wrap (B).

I like that the no-f version (B) is cleaner and shows that we never support Wrap/New without Sprintf-like formatting. On the other hand, people are used to the f suffix, plus there might be nice tooling that works only if the method has the f suffix (I don't have an example, but I have seen linters for Printf formatting errors).

I think I am leaning towards (A) actually... Thoughts?

Hi, thanks for the nice feedback and the reviews.
I am fine with both approaches, as the benchmarks have shown that they perform equivalently.

Yes, what you said in support of approach A about developer experience and linters definitely makes sense. People are accustomed to having both f and non-f APIs, and having only non-f APIs (approach B) would definitely incur some adaptation cost and repeated visits to the package code to understand what it actually does and how.

Okay then, do let me know your final thoughts - we are going with approach A, right? I'll make the necessary changes ASAP : )
Thank you!

@bwplotka
Member

bwplotka commented Apr 7, 2022

Yes, I think A is safer. 🤗

@bisakhmondal
Contributor Author

Yes, I think A is safer.

Sure, updating : )

// with a stacktrace containing recent call frames.
//
// If cause is nil, this is the same as New.
func Wrap(cause error, msg string) error {
Member

I see we kept both the non-f and f methods in the end? 🤔

Contributor Author

Hey, I thought we were focusing on the end-user experience, which is why I ended up keeping both.
I have changed my mind; let's simplify things and stick to only the f versions 😄

Contributor Author

Mission accomplished : )

bwplotka previously approved these changes Apr 11, 2022
Member

@bwplotka bwplotka left a comment

Nice, LGTM, just minor nits. We had an opportunity to simplify this package, but we ended up having all of New, Errorf, Wrap and Wrapf. Again, there is an opportunity to simplify this, so reviewers do not need to be a pain by asking "Why did you use Errorf("non formattable error") and not New?" (and similarly for Wrap).

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>
auto-merge was automatically disabled April 23, 2022 13:56

Head branch was pushed to by a user without write access

@bisakhmondal bisakhmondal dismissed stale reviews from bwplotka and matej-g via 5f0c859 April 23, 2022 13:56
@bisakhmondal
Contributor Author

I think this needs rebasing, the docs check will keep failing due to an outdated link in the docs 😞 cc @bisakhmondal

Done. Thanks to you both for the awesome review : )

@bisakhmondal
Contributor Author

bisakhmondal commented Apr 30, 2022

Hi Guys, all the CIs (well, as usual except for the lift) have passed. Please approve and merge this. Then I'll proceed with the rest of the refactoring : )

@bisakhmondal
Contributor Author

Ping @bwplotka @matej-g
Let's merge this guys : )

Member

@bwplotka bwplotka left a comment

Sorry for the massive lag! KubeCon toil. LGTM!

Let's try to use it and improve it along the way. Ideally, at some point we move it to efficientgo/core or, better, to a separate repo so others can import it.

@bwplotka bwplotka merged commit 18b4dc3 into thanos-io:main May 24, 2022
@bwplotka
Member

Thank you! 🎉 Amazing work!

@bisakhmondal
Contributor Author

Thank you everyone for those awesome reviews :)

openshift-merge-robot pushed a commit to stolostron/thanos that referenced this pull request Dec 8, 2022
* Remove debug line (#5245)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* e2e: fix compact test's flakiness (#5246)

Fix the compact test's by running this sub-test sequentially. The
further steps depend on this test's results so it's wrong to run it as a
sub-test.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* bump prometheus version to v2.33.5 (#5256)

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* info: Return store info only when the service is ready (#5255)

* return store info only when the service is ready

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Merge release 0.25 to main (#5210)

* Cut 0.25.0-rc.0 (#5184)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Cut v0.25.0 (#5209)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Create v0.25.1 built with Go 1.17.8 (#5226)

The binaries published with this release are built with Go1.17.8 to
avoid
[CVE-2022-24921](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24921).

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* *: Cut 0.25.2 rc.0 (#5247)

* fix: add null check to exemplar data (#5202)

Signed-off-by: Thomas Mota <tmm@danskecommodities.com>

* Ruler: Fix WAL directory in stateless mode (#5242)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG, VERSION

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updates busybox SHA (#5234)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

Co-authored-by: Tomás Mota <tomasrebelomota@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* Cut v0.25.2

Signed-off-by: Matej Gera <matejgera@gmail.com>

Update tutorials

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Matthias Loibl <mail@matthiasloibl.com>
Co-authored-by: Tomás Mota <tomasrebelomota@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* Implement GRPC query API (#5250)

With the current GRPC APIs, layering Thanos Queriers results in
the root querier getting all of the samples and executing the query
in memory. As a result, the intermediary Queriers do not do any
intensive work and merely transport samples from the Stores to the
root Querier.

When data is perfectly sharded, users can implement a pattern where
the root Querier instructs the intermediary ones to execute the queries
from their stores and return back results. The results can then be
concatenated by the root querier and returned to the user.

In order to support this use case, this commit implements a GRPC API
in the Querier which is analogous to the HTTP Query API exposed
by Prometheus.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Change error cleanup in `objstore.DownloadDir` to delete files not destination dir (#5229)

* Change error cleanup in objstore.DownloadDir to delete files not directories

Dst is always a directory. If any file after the first fails to download,
the cleanup will fail because the destination already contains at least one file.
This commit changes the cleanup logic to clean up successfully downloaded files one by one
instead of attempting to clean up the whole dst directory.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add cleanup of root dst directory.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Add unit test for cleanup of DownloadDir

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Fix linter

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Update index.html (#5264)

* Add SumUp logo to adopters (#5267)

Signed-off-by: Guilherme Souza <101073+guilhermef@users.noreply.github.com>

* receive: Added tenant ID  error handling of remote write requests. (#5269)

Plus better explanation.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Add TIXnGO logo to adopters (#5273)

Signed-off-by: Pierre Hanselmann <pierre.hanselmann@gmail.com>

* Fix miekgdns resolver to work with CNAME records too (#5271)

* Fix miekgdns resolver to work with CNAME records too

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove unused context

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Update pkg/discovery/dns/miekgdns/resolver.go

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Lucas Servén Marín <lserven@gmail.com>

Co-authored-by: Lucas Servén Marín <lserven@gmail.com>

* UI: Remove old ui (#5145)

* remove old ui

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* add changelog

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* update assets

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* Updates busybox SHA (#5283)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* build(deps): bump moment from 2.29.1 to 2.29.2 in /pkg/ui/react-app (#5274)

Bumps [moment](https://github.com/moment/moment) from 2.29.1 to 2.29.2.
- [Release notes](https://github.com/moment/moment/releases)
- [Changelog](https://github.com/moment/moment/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/moment/moment/compare/2.29.1...2.29.2)

---
updated-dependencies:
- dependency-name: moment
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: fix URLs preventing generation and unblock CI (#5285)

* docs: fix Ian Billett's GitHub handle

I noticed that CI was failing [0] for PR
https://github.com/thanos-io/thanos/pull/5284 because Ian had changed
his GitHub handle from @ianbillett to @bill3tt. This commit fixes this.

[0] https://github.com/thanos-io/thanos/runs/6050355497?check_suite_focus=true#step:5:135

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* docs: fix broken links to GitHub docs

Currently, documentation generation is failing because mdox can't fetch
some GitHub documentation pages since the URLs for the help content has
changed. This commit updates the links to use the correct URLs.

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* MAINTAINERS.md: regenerate

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>

* UI: Update vulnerable dependencies (#5233)

* refactor global window typings

Use declaration merging for better window types

Signed-off-by: Gabriel Bernal <gbernal@redhat.com>

* bump vulnerable react-scripts version

Signed-off-by: Gabriel Bernal <gbernal@redhat.com>

* Add Vestiaire Collective as adopter (#5289)

Signed-off-by: claude ebaneck <claudeforlife@gmail.com>

Co-authored-by: claude ebaneck <claude.ebaneck@vestiairecollective.com>

* Implement Query API discovery (#5291)

A recent commit (#5250) added a GRPC API to Thanos Query which allows
executing PromQL over GRPC. This API is currently not discoverable
through endpointsets which makes it hard for other Thanos components
to use it.

This commit extends endpointsets with a GetQueryAPIClients method
which returns Query API clients to all components which support
this API.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Added support for ppc64le (#5290)

* Added support for ppc64le

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated Changelog

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated promu & protoc

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Updated Makefile comment

Signed-off-by: Marvin Giessing <marvin.giessing@gmail.com>

* Added target API tests (+goleak). (#5260)

Attempted to repro https://github.com/thanos-io/thanos/issues/5257, but no good luck.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Revert "Added target API tests (+goleak). (#5260)" (#5297)

This reverts commit 955ea6dcae2529ad5b5b97a6a11150a5906d775a.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Use correct filesystem/network path separators when uploading blocks (#5281)

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* query-frontend: Don't cache request with dedup=false  (#5300)

* query-frontend: Added repro for dedup affecting precision of querying.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* QFE does not cache request with dedup=false.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Move info about queries that skip cache logic to docs

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update CHANGELOG

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Run docs formatter

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix e2e tests where caching logic is desired

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* mixin: Fix typo in ThanosCompactHalted alert (#5306)

Signed-off-by: Pedro Araujo <pedro.araujo@saltpay.co>

* Avoid starting goroutines for memcached batch requests before gate (#5301)

Use the doWithBatch function to avoid starting goroutines to fetch batched
results from memcached before they are allowed to run via the concurrency
Gate. This avoids starting many goroutines which cannot make any progress
due to a concurrency limit.

Fixes #4967

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Cut readme for 0.26 (#5311)

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* Reviewed and updated Changelog for 0.26-rc0 (#5313)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* Cut 0.26.0-rc.0 set version correctly (#5317)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

Co-authored-by: Wiard van Rij <wvanrij@roku.com>

* docs: Fix broken link to introduction blog (#5319)

Signed-off-by: jmjf <jamee.mikell@gmail.com>

* Ensure memcached batched requests handle context cancelation (#5314)

* Ensure memcached batched requests handle context cancellation

Ensure that when the context used for Memcached GetMulti is cancelled,
getMultiBatched does not hang waiting for results that will never be
generated (since the batched requests will not run if the context has
been cancelled).

Fixes an issue introduced in #5301

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Lint fixes

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review changes: run batches unconditionally

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* stalebot: add generic label to avoid stalebot (#5322)

Add a generic label which tells stalebot not to close issues marked with
it.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Use proper replicalabels in GRPC Query API (#5308)

The GRPC Query API uses only the replica labels coming from the
RPC request and ignores the ones configured when starting the querier.

This commit ensures that the API falls back on the preconfigured
replica labels when they are not provided in the request.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* groupcache: reduce log severity (#5323)

Sometimes certain operations can fail with some error(-s) being expected
e.g. a deletion marker might or might not exist. Thus, these log lines
could get triggered even though nothing bad is happening. Since the
expected errors are known only at the very end, near the call site, and
because `error`s are already logged in other places, and because these
Fetch()/Store() functions are working in best-effort scenario, I propose
reducing the severity of these log lines to `debug`.

Fixes https://github.com/thanos-io/thanos/issues/5265.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Update release process (#5325)

* update release process

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Add info about VERSION file

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* query-frontend: improve docs on requestes excluded from cache (#5326)

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* cut release 0.26.0 (#5330)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Updates busybox SHA (#5336)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* receive: fix deadlock on interrupt in routerOnly mode (#5339)

* fix receive router deadlock on interrupt

Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>

* Update changelog

Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>

* docs: Updated information about our community call. (#5309)

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* reloader: Force trigger reload when config rollbacked (#5324)

* Add Cache metrics to groupcache (#5352)

Add metrics about the hot and main caches[0].
* Number of bytes in each cache.
* Number of items in each cache.
* Counter of evictions from each cache.

[0]: https://pkg.go.dev/github.com/vimeo/galaxycache#CacheStats

Signed-off-by: SuperQ <superq@gmail.com>

* e2e: Refactored service helpers to be consistent with new API. (#5348)

* test: Added Alert compatibilty test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Tmp.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Refactored service helpers for newest e2e version.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed alert combatibiltiy test for now.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed lint.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed lint2.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed nginx service.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Refactored ruler.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* fixes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed compactor.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* What about now?

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* groupcache: fix handling of slashes (#5357)

Use https://github.com/julienschmidt/httprouter#catch-all-parameters for
the groupcache route otherwise slashes in the cache's key gets
interpreted by the router and thus groupcache's function never gets
invoked, and all clients get 404.

Remove test regarding cache hit because now Thanos Store during test
constantly generates cache hits due to 1s delay between block
information refreshes.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Adds more info about the formatting part. (#5347)

* Adds more info about the formatting part. Closes #5282

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* adds extra newline

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Update promdoc to solve #5344 (#5345)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* e2e: Refactored Receive Builder to be consistent with other helpers. (#5358)

* e2e: Refactored Receive Builder to be consistent with other helpers.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Updates busybox SHA (#5365)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* e2e: Fixed exemplar support in receive helper. (#5372)

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Enforce memcached concurrency limit with unbatched requests (#5360)

* Enforce memcached concurrency limit with unbatched requests

This ensures that requests that are _not_ split into batches still count
towards the concurrency limit that the client enforces.

This fixes an issue introduced in #5301

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Lint fix

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* docs: fix link (#5379)

I think I've found a replacement for the dead link.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* cache: do not copy data in groupcache (#5378)

Add a unsafe codec which uses the given byte slices directly to avoid
copying - we are doing ioutil.ReadAll() either way so there is no need
to copy anything.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* fix ruler send empty alerts (#5377)

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Add custom `errors` package with stack trace functionality (#5239)

* feat: a simple stacktrace utility

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* feat: custom errors package with new, errorf, wrapping, unwrapping and stacktrace

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* chore: update existing errors import (small subset)

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* chore: update comments

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add errors into skip-files linter config

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* intoduce UnwrapTillCause to suffice the limitation of Unwrap

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* Revert "chore: update existing errors import (small subset)"

This reverts commit d27f0177fe6c8a357ba10e4ac8bfee87c8bf985c.

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* revert makefile && golangcilint file

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* apply PR feedbacks

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* stacktrace and errors test

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* fix typo

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update stacktrace testing regex

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add lint ignore for standard errors import inside errors pkg

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* [test files] add copyright headers

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* add no lint to avoid false misspell detection of keyword Tast

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update stacktrace output test line number with regex pattern

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* return pc slice with reduced capacity

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* segregate formatted vs non formatted methods

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* update with only f functions

Signed-off-by: Bisakh Mondal <bisakhmondal00@gmail.com>

* Group memcached keys based on server when performing batch gets (#5356)

* Group memcached keys based on server when performing batch gets

Order and group keys during batch get operations based on the memcached
server they will be sharded to. This reduces the number of connections
that must be made within each batch of get operations.

Fixes #5353

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review changes

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Fix error in testutil method added

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review: comments for selector interface

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* QueryFrontend: pre-compile regexp (#5383)

* pre compile regexp

Signed-off-by: Jin Dong <djdongjin95@gmail.com>

* rename oppattern to labelvaluespattern

Signed-off-by: Jin Dong <djdongjin95@gmail.com>

* [FEAT] adding thanos consul blogpost (#5387)

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Fix empty $externalLabels when templating labels in rule. (#5394)

Signed-off-by: Rostislav Benes <r.dee.b.b@gmail.com>

Co-authored-by: Rostislav Benes <r.dee.b.b@gmail.com>

* support series relabeling on Thanos receiver (#5391)

* support series relabeling on Thanos receiver

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* add changelog

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix lint

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update lint

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix e2e test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix relabel config pass

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* cleanup white space

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* address review comments

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* address comments

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update comment

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* Expose GatherFileStats. (#5400)

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Rule: Error out earlier when building alertmanager config (#5405)

* Error out earlier when building alertmanager config

Signed-off-by: Jéssica Lins <jessicaalins@gmail.com>

* Add test case for empty host

Signed-off-by: Jéssica Lins <jlins@redhat.com>

* [5130] [.*:] Upgrade Minio used for local development and e2e tests (#5392)

* add updated bingo .gitignore

Signed-off-by: B0go <victorbogo@icloud.com>

* update bingo minio version to commit 91130e884b5df59d66a45a0aad4f48db88f5ca63

Signed-off-by: B0go <victorbogo@icloud.com>

* trigger CI

Signed-off-by: B0go <victorbogo@icloud.com>

* Submit a proposal for vertical query sharding (#5350)

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* query: Close() after using query (#5410)

* query: Close() after using query

This should reduce memory usage because Close() returns points back to a
sync.Pool.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* CHANGELOG: add item

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* query: call Close() in gRPC API too

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* avoided potential panic due to divide by 0 (#5412)

Signed-off-by: Aditi Ahuja <ahuja.aditi@gmail.com>

* sidecar/compact/store/receiver - Add the prefix option to buckets (#5337)

* Create prefixed bucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* started PrefixedBucket tests

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* finish objstore tests

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Simplify string removal logic

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Test more prefix cases on PrefixedBucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Only use a prefixedbucket if we have a valid prefix

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add single unit test for prefixedBucket prefix

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* test other prefixes on UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add remaining methods to UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add prefix to docs examples

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Simplify Iter method

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* add prefix explanation to S3 docs

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Conclusion of prefix sentence on docs

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Use DirDelim instead of magic string

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add log when using prefixed bucket

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove "@" from test string to make them simpler

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* fix BucketConfig Config type - back to interface

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add changelog

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add missing checks in UsesPrefixTest

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* fix linter and test errors

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Add license to new files

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove autogenerated docs

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove duplicated transformation of string->[]byte

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add prefixed bucket on all e2e tests for S3

The idea is that if it works, we can add for all other providers.
Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Add e2e tests using prefixed bucket to all providers

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* refactor: move validPrefix to prefixed_bucket logic

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Enhance the documentation about prefix.

Signed-off-by: jademcosta <jademcosta@gmail.com>

* Fix format
Signed-off-by: jademcosta <jademcosta@gmail.com>

* Add prefix entry on bucket config example

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Removing redundancies on prefix checks and tests

We already check that the prefix is not empty when creating the bucket.

Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove redundant YAML unmarshal
Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove unused parameter
Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* Remove docs that should be auto-generated
Signed-off-by: jademcosta <jade.costa@nubank.com.br>

* refactor: move prefix to config root level

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* add auto generated docs

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* fix changelog

Signed-off-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

Co-authored-by: Maria Eduarda Duarte <dudammduarte@yahoo.com.br>

* Ruler: Change default evaluation interval to 1m (#5417)

* Change default eval interval to 1m

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updates busybox SHA (#5423)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* receive: Added Ketama Consistent hashing (#5408)

* Add support for consistent hashing in receivers

This commit adds support for distributing series in Receivers using
consistent hashing based on the libketama algorithm.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
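
To illustrate the idea behind the commit above, here is a minimal sketch of a ketama-style hashring. It is not the Thanos implementation: the `ring`, `section`, `newRing` and `get` names are hypothetical, and FNV-1a is used for brevity, which is an assumption of this sketch rather than the hash the receiver uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// section is one virtual point on the ring, owned by an endpoint.
type section struct {
	hash     uint64
	endpoint string
}

// ring is a hypothetical ketama-style hashring (illustrative only).
type ring struct {
	sections []section
}

// hashKey uses FNV-1a; the hash choice is an assumption of this sketch.
func hashKey(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // hash.Hash.Write never returns an error
	return h.Sum64()
}

// newRing places sectionsPerNode virtual points per endpoint on the ring
// and sorts them by hash so lookups can use binary search.
func newRing(endpoints []string, sectionsPerNode int) *ring {
	r := &ring{}
	for _, e := range endpoints {
		for i := 0; i < sectionsPerNode; i++ {
			r.sections = append(r.sections, section{
				hash:     hashKey(fmt.Sprintf("%s-%d", e, i)),
				endpoint: e,
			})
		}
	}
	sort.Slice(r.sections, func(i, j int) bool {
		return r.sections[i].hash < r.sections[j].hash
	})
	return r
}

// get returns the endpoint owning the first section at or after the key's
// hash, wrapping around to the start of the ring. It returns "" for an
// empty ring.
func (r *ring) get(key string) string {
	if len(r.sections) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.sections), func(i int) bool {
		return r.sections[i].hash >= h
	})
	if i == len(r.sections) {
		i = 0
	}
	return r.sections[i].endpoint
}

func main() {
	r := newRing([]string{"receiver-0", "receiver-1", "receiver-2"}, 100)
	fmt.Println(r.get(`{__name__="up",job="node"}`))
}
```

The property that matters here is that adding an endpoint only adds new sections: a key either keeps its previous owner or moves to the new endpoint.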

* Use require package for test assertions

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Rename algorithm from consistent to ketama

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* S3: Add config option to enforce the minio DNS lookup (#5409)

* Add config option to enforce the minio DNS lookup

Signed-off-by: Jakob Hahn <jakob.hahn@hetzner.com>

* Use enums instead of boolean for bucket_lookup_type

Signed-off-by: Jakob Hahn <jakob.hahn@hetzner.com>

* Expose tsdb status in receiver (#5402)

* Expose tsdb status in receiver

This commit implements the api/v1/status/tsdb API in the Receiver.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add docs and todo

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix tests

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Receive: option to extract tenant from client certificate (#5153)

* added option to extract tenant from client certificate

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* added suggestions from PR

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* removed else cases

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* corrected location of certificate field check

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* fixed issue with err definition

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* updated docs

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

* corrected comment

Signed-off-by: Magnus Kaiser <magnus.kaiser@gec.io>

Co-authored-by: Magnus Kaiser <magnus.kaiser@gec.io>

* Improve ketama hashring replication (#5427)

With the Ketama hashring, replication is currently handled by choosing
subsequent nodes in the list of endpoints. This can lead to existing nodes
getting more series when the hashring is scaled.

This commit changes replication to choose subsequent nodes from the hashring
which should not create new series in old nodes when the hashring is scaled.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
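
The replication change described above can be sketched roughly as follows. This is a hedged illustration, not the code from the commit: `section` and `replica` are hypothetical names, and skipping endpoints that were already chosen is an assumption about how distinct replicas are picked.

```go
package main

import "fmt"

// section is one virtual point on an already sorted ring.
type section struct {
	hash     uint64
	endpoint string
}

// replica walks the ring clockwise from index start and returns the endpoint
// for replica number n (0 is the primary), skipping endpoints that were
// already chosen for a lower replica number.
func replica(sections []section, start, n int) (string, error) {
	seen := map[string]struct{}{}
	for i := 0; i < len(sections); i++ {
		e := sections[(start+i)%len(sections)].endpoint
		if _, ok := seen[e]; ok {
			continue
		}
		if len(seen) == n {
			return e, nil
		}
		seen[e] = struct{}{}
	}
	return "", fmt.Errorf("not enough distinct endpoints for replica %d", n)
}

func main() {
	sections := []section{{10, "a"}, {20, "a"}, {30, "b"}, {40, "c"}}
	for n := 0; n < 3; n++ {
		e, err := replica(sections, 0, n)
		fmt.Println(n, e, err)
	}
}
```

Because replicas follow ring order rather than the order of the endpoint list, adding an endpoint changes only the neighbourhoods around its own sections.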

* Cut readme for 0.27 (#5429)

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Added alert compliance test for Thanos (#5315)

* test: Added Alert compatibility test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Tmp.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Refactored service helpers for newest e2e version.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed alert compatibility test for now.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Added test for compatibility.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added Querier /alerts API.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* e2e: Added replica labels.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Option to remove replica-label.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* skip.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Use stateful ruler and default resend delay

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update docs

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Matej Gera <matejgera@gmail.com>

* 0.27-rc0 Update readme and version (#5430)

* Update readme and version

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Fix newlines

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Fixes typo

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* fixes noise

Signed-off-by: Wiard van Rij <wvanrij@roku.com>

* Alert Compliance: Fix wrong ruler configuration (#5433)

* [receive] Export metrics about remote write requests per tenant (#5424)

* Add write metrics to Thanos Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Let the middleware count inflight HTTP requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Receive write metrics type & definition

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Put option back in its place to avoid big diff

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fetch tenant from headers instead of context

It might not be in the context in some cases.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Delete unnecessary tenant parser middleware

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor & reuse code for HTTP instrumentation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add missing copyright to some files

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add changelog entry for Receive & new HTTP metrics

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove TODO added by accident

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Make error handling code shorter

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Make switch statement simpler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove method label from timeseries' metrics

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Count samples of all series instead of each

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove in-flight requests metric

Will add this in a follow-up PR to keep this small.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Change timeseries/samples metrics to histograms

The buckets were picked based on the fact that Prometheus' default
remote write configuration
(see https://prometheus.io/docs/practices/remote_write/#memory-usage)
sets a max of 500 samples sent per second.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix Prometheus registry for histograms

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix comment in NewHandler functions

There are now four metrics instead of five.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* remove unused block-sync-concurrency flag (#5426)

* remove unused block-sync-concurrency flag

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* add changelog

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* update

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix e2e test

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix tests

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* fix docs typo in metric thanos_compact_halted (#5448)

Signed-off-by: Nikita Matveenko <nikitapecasa@gmail.com>

* Implement tenant expiration (#5420)

* Implement tenant expiration

This commit adds dynamic TSDB pruning for tenants which have not
received new samples within a certain period of time.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add link to receiver tenant-lifecycle-management

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Docs: Remove Katacoda links (#5454)

* Remove Katacoda links

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Remove one more reference

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fixed lint on Go 1.18.3+ (#5459)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Add HTTP metrics for in-flight requests (#5440)

* Add HTTP metrics for in-flight requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix changelog entry after PR creation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix link in old CHANGELOG entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix style in the CHANGELOG

All the entries should end up with a period.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve help for in-flight HTTP requests metric

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Move changelog entry pending PR

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add a method label to the in-flight HTTP requests

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* docs: Fix heading level of "Excluded from caching" (#5455)

* Refactor DefaultTransport() from objstore to package exthttp (#5447)

* Refactoring the DefaultTransport func in package exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Refactoring the DefaultTransport func from s3 in package exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Updated helpers.go

corrected argument for DefaultTransport() in helpers.go

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Changed the argument type in getContainerURL

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Update pkg/exthttp/transport.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Update pkg/exthttp/transport.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Srushti (sroo-sh-tee) <73685894+SrushtiSapkale@users.noreply.github.com>

* Removed the use of NewTransport() in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Moved TLSConfig struct and functions that need it from objstore to exthttp

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed s3.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Kept s3.go and helpers.go unchanged to not break the cortex deps

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Consistency changes made while pair++ programming.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Created a new tlsconfig in exthttp and minor changes in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Commented in s3.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Minor changes in transport.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed transport.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Changed transport.go and tlsconfig.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Removed changes from prometheus.mod and prometheus.sum

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

* Minor update in cos.go

Signed-off-by: Srushti Sapkale <srushtiisapkale@gmail.com>

Co-authored-by: bwplotka <bwplotka@gmail.com>

* receive: Fix race condition when pruning tenants (#5460)

Pruning Receiver tenants has a race condition caused by concurrently
removing items from the tenants map.

This commit addresses the issue by using a mutex to guard the tenants map.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Adding SCMP as an adopter (#5466)

Signed-off-by: Chris Ng <2509212+chris-ng-scmp@users.noreply.github.com>

* Updated busybox version. (#5471)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Refactor endpoint ref clients

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fix E2E test env name clash

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Build with Go 1.18 (#5258)

* Build with Go 1.18

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Try something

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Upgrade minio

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Replace json-iterator/reflect2 in bingo

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Ignore 405 errors for prometheus buildVersion API requests (#5477)

Older versions of prometheus (such as 2.7 which is shipped by Debian
buster) return a 405 error for non-existent API endpoints instead of the
404 returned by more recent versions.

Signed-off-by: Nicolas Dandrimont <olasd@softwareheritage.org>
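
The check described above could look roughly like this. It is a sketch under assumptions: `buildVersionUnsupported` is a hypothetical helper, and how the actual client reacts to the result is not shown in the commit message.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildVersionUnsupported reports whether a status code should be read as
// "this Prometheus does not serve the build version endpoint". Newer
// versions answer 404 for unknown API paths, older ones answer 405.
func buildVersionUnsupported(statusCode int) bool {
	return statusCode == http.StatusNotFound ||
		statusCode == http.StatusMethodNotAllowed
}

func main() {
	for _, code := range []int{http.StatusOK, http.StatusNotFound, http.StatusMethodNotAllowed} {
		fmt.Println(code, buildVersionUnsupported(code))
	}
}
```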

* *: Cut 0.27.0 (#5473)

* Cut 0.27.0

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Updated busybox version. (#5471)

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>

* Docs: Remove Katacoda links (#5454)

* Remove Katacoda links

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Remove one more reference

Signed-off-by: Matej Gera <matejgera@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update compact.md (#5465)

* During 1h downsampling skip XOR chunks that may erroneously be present in 5m resolution blocks (#5453)

* Add fpetkovski to triage list

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use Azure BlobURL.Download instead of in-memory buffer (#5451)

Modify the azure.Bucket get methods to use BlobURL.Download for fetching
blobs and blob ranges. This avoids the need to allocate a buffer for storing
the entire expected size of the object in memory. Instead, use a ReaderCloser
view of the body returned by the download method.

See grafana/mimir#2229

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update storage.md (#5486)

* [receive] Add per-tenant charts to Receive's example dashboard  (#5472)

* Start to add tenant charts to Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Properly filter HTTP status codes

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix tenant error rate chart

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor to improve readability and consistency

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor one more usage of code and tenant labels

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Filter tenant metrics to the Receive handler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Format math expression properly

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update CHANGELOG

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add samples charts to series & samples row

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Bump Go version in all the GH Actions (#5487)

* Bump go version in go mod

This is a follow up to #5258, which made the project be built with Go 1.18.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Go version in all GH Actions

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Run go mod tidy

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Added changelog entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Put back Go 1.17 in go.mod

We don't use any Go 1.18 feature yet, so it's not needed.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update go.sum after changing go.mod to go 1.17

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove non-user-impacting entry for changelog

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* objstore: Download and Upload block files in parallel (#5475)

* Parallel Chunks

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* test

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Changelog

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* making ApplyDownloadOptions private

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* upload concurrency

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Upload Test

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* update change log

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Change comments

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Address comments

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Remove duplicate entries on changelog

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Addressing Comments

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* update golang.org/x/sync

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>

* Adding Comments

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Use default HTTP config for E2E S3 tests (#5483)

Signed-off-by: Matej Gera <matejgera@gmail.com>

* chore: Included githubactions in the dependabot config (#5364)

This should help with keeping the GitHub actions updated on new releases. This will also help with keeping it secure.

Dependabot helps in keeping the supply chain secure https://docs.github.com/en/code-security/dependabot

GitHub actions up to date https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot

https://github.com/ossf/scorecard/blob/main/docs/checks.md#dependency-update-tool
Signed-off-by: naveensrinivasan <172697+naveensrinivasan@users.noreply.github.com>

* bump codemirror and promql editor to the last version (#5491)

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* receiver: Expose stats for all tenants (#5470)

* receiver: Expose stats for all tenants

Thanos Receiver supports the Prometheus tsdb status API and can expose
TSDB stats for a single tenant.

This commit extends that functionality and allows users to request
TSDB stats for all tenants using the all_tenants=true query parameter.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add back chunk count

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Simplify TSDBStats interface

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Return empty result for no stats

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* CHANGELOG.md: regenerate (#5495)

* receive: Fix stats nil pointer panic (#5494)

When fetching TSDB stats from receivers, certain TSDBs might not be
initialized yet. This can lead to a nil pointer access when the
status endpoint is accessed before all TSDBs are initialized.

This commit adds an explicit check for each tenant's TSDB when
exporting TSDB stats.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
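
A small sketch of the explicit check mentioned above, with hypothetical `tenant`, `tsdb` and `collectStats` names. The assumption is that a tenant entry exists before its TSDB is opened, so the database pointer can still be nil when stats are requested.

```go
package main

import "fmt"

// stats is a simplified stand-in for per-tenant TSDB statistics.
type stats struct {
	numSeries uint64
}

// tsdb is a simplified stand-in for an opened tenant database.
type tsdb struct {
	series uint64
}

func (d *tsdb) stats() stats { return stats{numSeries: d.series} }

// tenant holds a database pointer that stays nil until initialization ends.
type tenant struct {
	db *tsdb
}

// collectStats skips tenants whose TSDB is not initialized yet instead of
// dereferencing a nil pointer.
func collectStats(tenants map[string]*tenant) map[string]stats {
	out := make(map[string]stats, len(tenants))
	for id, t := range tenants {
		if t == nil || t.db == nil {
			continue
		}
		out[id] = t.db.stats()
	}
	return out
}

func main() {
	tenants := map[string]*tenant{
		"ready":    {db: &tsdb{series: 42}},
		"starting": {}, // TSDB not opened yet
	}
	fmt.Println(len(collectStats(tenants)), "tenant(s) reported")
}
```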

* Update query.md (#5496)

Fix typo of parameter --store.sd-files

Signed-off-by: Firxiao <Firxiao@users.noreply.github.com>

* Parallel download blocks - Follow up of #5475 (#5493)

* Download blocks in parallel

Signed-off-by: Alan Protasio <approtas@amazon.com>

* remove the go func

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Doc

Signed-off-by: Alan Protasio <approtas@amazon.com>

* CHANGELOG

Signed-off-by: Alan Protasio <approtas@amazon.com>

* doc

Signed-off-by: alanprot <alanprot@gmail.com>

* AddressComments

Signed-off-by: alanprot <alanprot@gmail.com>

* fix typo

Signed-off-by: Alan Protasio <approtas@amazon.com>

* Upgrade mdox with cache and some http settings to reduce CI failures (#5500)

* Pin mdox to latest master commit

It now supports a cache for link validation and some HTTP
configuration that can be used to help avoid intermittent
CI failures.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add mdox cache and HTTP configuration

The cache has a default TTL (5 days)

A timeout of 1m and 10 connections per host at transport
level should help us reduce the intermittent failures if
we have to invalidate the cache.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add Github Action cache for the mdox cache

Using the hash of the md files as cache key.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Upgrade cache actions to v3 and add restore key

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Empty commit to test CI build cache

Signed-off-by: GitHub <noreply@github.com>

* Use 2.5 days as jitter for mdox cache

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix bad editor auto-formatting again

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Updated minio-go to latest; removed fork. (#5474)

* Updated minio-go fork to latest.

NOTE: Optimization is proposed to upstream to avoid fork in future.

Relates to https://github.com/thanos-io/thanos/issues/5101 and https://github.com/thanos-io/thanos/issues/5130

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Removed fork.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Added comment.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Receiver: Handle storage exemplar multi-error (#5502)

* Handle exemplar store errors as conflict

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Adjust tests

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Fixing Race condition Introduced by #5493  (#5503)

* Update busybox image versions (#5506)

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Updates busybox SHA (#5507)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: yeya24 <yeya24@users.noreply.github.com>

* chore: Update Prometheus dependency (#5484)

* chore: Update Prometheus dependency

Update Prometheus from v2.33.5 to v2.36.2.

Signed-off-by: SuperQ <superq@gmail.com>

* Update query tests for cortex changes.

Signed-off-by: SuperQ <superq@gmail.com>

* Use the default rules.RuleGroupPostProcessFunc.

Signed-off-by: SuperQ <superq@gmail.com>

* Update QueryStats use.

Signed-off-by: SuperQ <superq@gmail.com>

* Update Cortex.

Signed-off-by: SuperQ <superq@gmail.com>

* Update queryfrontend for Cortex changes.

Signed-off-by: SuperQ <superq@gmail.com>

* Bump pprof.

Signed-off-by: SuperQ <superq@gmail.com>

* Add changelog entry.

Signed-off-by: SuperQ <superq@gmail.com>

* Adapt to changed query stats API

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Sync dependencies

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Reflect changed metric names

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Kemal Akkoyun <kakkoyun@gmail.com>
Co-authored-by: Kemal Akkoyun <kakkoyun@users.noreply.github.com>

* chore: Vendor Cortex dependency as an internal package (#5504)

* Vendor Cortex dependency as an internal package

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add gitattributes

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Skip checks for vendored directory

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add copyright headers for Cortex

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* *: Move objstore out of repo (#5510)

* *: Move objstore out of repo

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix doc checks

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* chore: Update Prometheus to v2.37.0 (#5511)

* chore: Update Prometheus to v2.37.0

Update Prometheus to the latest release. Note that Prometheus
upstream now tags v0.x.y to map to the 2.x.y releases.

Signed-off-by: SuperQ <superq@gmail.com>

* Cleanup direct/indirect go.mod requirements.

Signed-off-by: SuperQ <superq@gmail.com>

* chore: Update Go modules (#5516)

* Update weaveworks/common to remove node_exporter indirect dep.
* Update simonpasquier/klog-gokit/v2.
* Update google.golang.org/grpc lock to v1.45.0.
* Cleanup replacements that are now handled by indirect requirements.
* Fixup grpc.WithInsecure() use.

Signed-off-by: SuperQ <superq@gmail.com>

* chore: Update Go modules (#5518)

* Reuse upstream TSDB status structs (#5526)

This commit replaces the copied TSDB status structs with direct
references from prometheus/prometheus.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix proposal on website (#5530)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update all bingo dependencies (#5525)

This commit updates all bingo dependencies to their latest versions.

It pins golang.org/x/sys to v0.0.0-20220715151400-c0bba94af5f8 for
the github.com/google/go-jsonnet dependency in order to prevent
failures when running make docs on Mac OS.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* delete_katacoda (#5529)

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Remove empty RuleGroups in api/v1/rules when using matchers (#5537)

* Remove empty RuleGroups in api/v1/rules

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestion

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Rename variables

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* fix(api): Return a JSON struct with alerts in lowercase when querying the alerts endpoint. (#5534)

This matches the result of the Prometheus API.
Signed-off-by: Guillaume audic <audic.gui@gmail.com>

* Receiver: Add benchmark for receive writer (#5533)

* Add benchmark for receive writer

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Incorporate feedback

- Clearer parameter naming; use a separate temp dir for bench

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Submit a proposal for Active Series Limiting for Hashring Topology (#5415)

* Add proposal for Active Series Limiting for Hashring Topology

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Resize images

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add Observatorium as an alternative

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions; add TODO

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update proposal

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions: add sections numbers

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Refactor EndpointSet (#5538)

* Refactor EndpointSet

This commit refactors the EndpointSet struct in order to make it easier
to understand and work with.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Handle context cancellation in endpoint mock

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Make additions and removals of refs atomic.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix changed-docs grep regex (#5556)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Added Vertical Query Sharding to Query-Frontend (#5342)

* Update faillint to v1.10.0

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Implement query sharding

This commit implements query sharding for grouping PromQL expressions.

Sharding is initiated by analyzing the PromQL and extracting
grouping labels. Extracted labels are propagated down to Stores which
partition the response by hashmoding all series on those labels.

If a query is shardable, the partitioning and merging process will be
initiated by the Query Frontend. The Query Frontend will make N distinct
queries across a set of Queriers and merge the results back before
presenting them to the user.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
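
The store-side partitioning described above can be sketched as hashing only the grouping labels and taking the result modulo the shard count. This is an illustration under assumptions, not the Thanos code: `series` and `shardOf` are hypothetical names, FNV-1a is an arbitrary hash choice, and `totalShards` is assumed to be greater than zero.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// series is a simplified label set: label name -> label value.
type series map[string]string

// shardOf hashes only the grouping labels, so all series that end up in the
// same output group of the aggregation land in the same shard.
func shardOf(s series, groupBy []string, totalShards uint64) uint64 {
	names := append([]string(nil), groupBy...)
	sort.Strings(names) // make the hash independent of label order
	h := fnv.New64a()
	for _, n := range names {
		h.Write([]byte(n))
		h.Write([]byte{0xff}) // separator to avoid ambiguous concatenation
		h.Write([]byte(s[n]))
		h.Write([]byte{0xff})
	}
	return h.Sum64() % totalShards
}

func main() {
	// Example query shape: sum by (pod) (rate(http_requests_total[5m]))
	groupBy := []string{"pod"}
	a := series{"pod": "api-1", "instance": "10.0.0.1"}
	b := series{"pod": "api-1", "instance": "10.0.0.2"}
	fmt.Println(shardOf(a, groupBy, 4) == shardOf(b, groupBy, 4))
}
```

Since both series share the `pod` value, they hash to the same shard, which is what lets each shard be aggregated independently before the Query Frontend merges the results.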

* First code review pass

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use sync pool to reuse sharding buffers

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add test for binary expression with constant

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Include external labels in series sharding

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Rule: Fix e2e test flake (#5558)

* Rule: Fix e2e test flake

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix lint

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Check errors

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Change to github.com/thanos-io/thanos/pkg/errors

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestion

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix multi-tenant exemplar matchers (#5554)

* Fix multi-tenant exemplar matchers

The exemplar proxy synthesizes a query based on PromQL expression matchers
and individual store's label sets. When a store has multiple label sets
with same label names but different values (e.g. multitenant Receivers),
each exemplar matcher will be repeated once for each label set. Because of this,
a receiver hosting 200 tenants can get the same exemplar matcher 200 times. This leads
to the underlying stores slowing down and timing out when asked for exemplars.

This commit modifies the exemplar proxy to deduplicate matchers and only send
a matcher once to an underlying store.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Address CR comments

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Receive: add per request limits for remote write (#5527)

* Add per request limits for remote write

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Remove useless TODO item

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Refactor write request limits test

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add write concurrency limit to Receive

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Change write limits config option name

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Document remote write concurrency limit

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add changelog entry

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Format docs

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Extract request limiting logic from handler

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add copyright header

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add a TODO for per-tenant limits

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add default value and hide the request limit flags

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve TODO comment in request limits

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Update Receive docs after flags were made hidden

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Add note about WIP in Receive request limits doc

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix typo in Receive docs

Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix help text for concurrent request limit

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Use byte unit helpers for improved readability

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Removed check for nil writeGate

The constructor sets the writeGate to a noopGate.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Better organize linebreaks

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix help text for limits hit metric

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Apply some English feedback

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Improve limits & gates documentation

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix import clause

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Use a 3 node hashring for write limits test

This should ensure the request fanout logic cannot somehow interfere
with the request limit logic.

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Fix comment

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Signed-off-by: Douglas Camata <159076+douglascamata@users.noreply.github.com>

* Announce sharding in ruler and store proxy (#5560)

The ruler and store proxy currently support series sharding
through the components that they use. However, this capability is not
announced to the querier.

This commit modifies their Info calls to indicate to the querier
that it doesn't need to shard the response it receives from rulers
and other store proxies.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix flaky e2e tests (#5563)

* Tools: Fix e2e test flake

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Metadata: Fix flaky e2e test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Compact: Fix flaky e2e test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Bumping actions/cache to v3 for e2e tests

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add missing e2e.WaitMissingMetrics

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Meta-monitoring based active series limiting (#5520)

* Add initial PoC for meta-monitoring Receive active series limits

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add e2e tests, rebase

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add multitenant test + remake diagrams

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions; Make naming consistent; Rm/Add metrics

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Reuse meta-monitoring client

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix panic

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cache meta-monitoring query result

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix lint

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fail fast when limiting

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Implement suggestions: docs + mutex + struct

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add interface and no-op

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add changelog entry

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add seriesLimitSupported to handler

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Remove tools fork

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Change docs header

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Remove usage of ioutil (#5564)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* docs/contribution.md: Update required Go version  (#5557)

* delete_katacoda

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* updated go version

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* update golang version

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* updated

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Retrigger CI

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* Retrigger CI

Signed-off-by: Akshit42-hue <patelakshit2025@gmail.com>

* fix an expression param in a link to an alert in the rules page (#5562)

Signed-off-by: Rostislav Benes <r.dee.b.b@gmail.com>

Co-authored-by: Rostislav Benes <r.dee.b.b@gmail.com>

* Receiver: Validate labels in write requests (#5508)

* Add label set validation method

Signed-off-by: Matej Gera <matejgera@g…
@saswatamcode saswatamcode mentioned this pull request Jun 26, 2024
2 tasks
Development

Successfully merging this pull request may close these issues.

Find replacement for https://github.com/pkg/errors
6 participants