Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/tcpcheck] New receiver #34458

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yanfeng1992
Copy link

@yanfeng1992 yanfeng1992 commented Aug 7, 2024

Signed-off-by: huangyanfeng huangyanfeng1992@gmail.com

Description:

To monitor the availability and performance of TCP endpoints.
Use case: TCP network detection plug-in, usually used to monitor whether a TCP port on the local machine is listening, or whether a remote port can be connected

Link to tracking Issue:

#34414

Testing:
(1) PR Includes coverage ~85% in the package with the scraper.
(2) Also includes coverage ~80% in the internal/tcpconfig package

Documentation:

The new receiver includes documentation in receiver/tcpcheckreceiver/README.md and receiver/tcpcheckreceiver/documentation.md.

The metric output looks like the following ...

Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/tcpcheckreceiver 1.0.0
Metric #0
Descriptor:
     -> Name: tcpcheck.duration
     -> Description: Measures the duration of TCP connection.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> tcp.endpoint: Str(10.255.88.232:8088)
StartTimestamp: 2024-08-08 08:45:04.75587161 +0000 UTC
Timestamp: 2024-08-08 08:45:05.758205866 +0000 UTC
Value: 1
Metric #1
Descriptor:
     -> Name: tcpcheck.status
     -> Description: 1 if the TCP client successfully connected, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> tcp.endpoint: Str(10.255.88.232:8088)
StartTimestamp: 2024-08-08 08:45:04.75587161 +0000 UTC
Timestamp: 2024-08-08 08:45:05.758205866 +0000 UTC
Value: 1

Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/tcpcheckreceiver 1.0.0
Metric #0
Descriptor:
     -> Name: tcpcheck.duration
     -> Description: Measures the duration of TCP connection.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> tcp.endpoint: Str(127.0.0.1:80)
StartTimestamp: 2024-08-08 08:42:18.432529413 +0000 UTC
Timestamp: 2024-08-08 08:42:19.437666882 +0000 UTC
Value: 0
Metric #1
Descriptor:
     -> Name: tcpcheck.error
     -> Description: Records errors occurring during TCP check.
     -> Unit: {error}
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> error.message: Str(dial tcp 127.0.0.1:80: connect: connection refused)
StartTimestamp: 2024-08-08 08:42:18.432529413 +0000 UTC
Timestamp: 2024-08-08 08:42:19.437666882 +0000 UTC
Value: 1
Metric #2
Descriptor:
     -> Name: tcpcheck.status
     -> Description: 1 if the TCP client successfully connected, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> tcp.endpoint: Str(127.0.0.1:80)
StartTimestamp: 2024-08-08 08:42:18.432529413 +0000 UTC
Timestamp: 2024-08-08 08:42:19.437666882 +0000 UTC
Value: 0

mx-psi
mx-psi previously requested changes Aug 7, 2024
Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for your contribution!

You will need a sponsor in order for us to accept this contribution. Please take a look at the instructions on adding new components. Thanks!

@yanfeng1992 yanfeng1992 force-pushed the tcp-checker branch 3 times, most recently from d53f92f to 5d5cdf3 Compare August 8, 2024 08:40
@yanfeng1992
Copy link
Author

yanfeng1992 commented Aug 8, 2024

Would you be willing to be my sponsor for this contribution?
Please take a look at the issue: #34414
@mx-psi

@yanfeng1992 yanfeng1992 requested a review from mx-psi August 8, 2024 09:19
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@crobert-1
Copy link
Member

Hello @yanfeng1992, thanks for opening this PR! I'm going to close for now as this component will require a sponsor before it's able to move forward. Once a sponsor is found you're welcome to reopen 👍

@crobert-1 crobert-1 closed this Sep 12, 2024
@yanfeng1992
Copy link
Author

yanfeng1992 commented Sep 13, 2024

Hello @yanfeng1992, thanks for opening this PR! I'm going to close for now as this component will require a sponsor before it's able to move forward. Once a sponsor is found you're welcome to reopen 👍

@crobert-1 Would you be willing to be my sponsor for this contribution?

I saw that other people also need this receiver

#33718

And Fluent Bit also has a tcp health check plug-in
https://docs.fluentbit.io/manual/pipeline/inputs/health

@yanfeng1992
Copy link
Author

Can you review the code again? Thanks @atoulme

@yanfeng1992 yanfeng1992 force-pushed the tcp-checker branch 3 times, most recently from acf0690 to 875ec5d Compare October 18, 2024 02:45
if err == nil {
success = 1
}
s.mb.RecordTcpcheckDurationDataPoint(now, time.Since(start).Nanoseconds(), s.Config.TCPClientSettings.Endpoint)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we record duration in ms instead of nanoseconds?

enabled: true
gauge:
value_type: int
unit: ns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ms instead of ns?


```yaml
receivers:
sshcheck:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, also can you make this support multiple endpoints instead of just one as specified currently?

}

if err = s.scrapeTCP(now); err != nil {
s.mb.RecordTcpcheckErrorDataPoint(now, int64(1), s.Endpoint, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this implementation is not complete. I don't see where s.Endpoint comes from, I would expect the scrape to loop over s.Config.Targets or something.

@michael-burt
Copy link
Contributor

Instead of using sshcheckreceiver as a reference, I would recommend looking at httpcheckreceiver instead. This component should support multiple TCP endpoint targets. It should also use a mutex for the metric update while looping over the targets as a go routine so that you can parallelize the probes.

Copy link
Contributor

github-actions bot commented Nov 2, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Nov 2, 2024
@atoulme atoulme dismissed mx-psi’s stale review November 3, 2024 17:44

sponsor accepted

@atoulme atoulme removed the Stale label Nov 3, 2024
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

Introduce a new receiver, TCP Check Receiver.This receiver creates stats by connecting to an TCP server.

Related to open-telemetry#34414

N/A

metadata.yaml, README.

Signed-off-by: huangyanfeng <huangyanfeng1992@gmail.com>
@ZenoCC-Peng
Copy link
Contributor

Hi @yanfeng1992,
Could I have your approval to start working on these refinements? I’d like to enhance the functionality to satisfy @michael-burt the requirements for the TCP checker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants