feat: use cached results from GCS in TA GQL #952
Conversation
Codecov Report

Attention: Patch coverage is
✅ All tests successful. No failed tests found.

@@            Coverage Diff             @@
##             main     #952      +/-   ##
==========================================
- Coverage   96.25%   96.04%   -0.22%
==========================================
  Files         827      827
  Lines       19090    19068      -22
==========================================
- Hits        18376    18313      -63
- Misses        714      755      +41

Flags with carried forward coverage won't be shown.
❌ 5 Tests Failed

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.
Completed 2606 tests with 5 failures (pytest). Individual test run time comparisons against the main branch are in the Test Analytics Dashboard.
this refactor simplifies the tests for test results aggregates and flake aggregates and adds some typing to the generate_test_results function. It also changes the arguments passed to generate_test_results so handling them is simpler, and adjusts some of the error handling in that function.
we want the GQL endpoints to consume the cached query results from GCS and then do some filtering and ordering on top of that, and also do some extra caching of those results in redis. this commit also contains a bunch of reorganization and refactoring of the GQL code
when we miss the redis cache but find the test rollup results in GCS, we want to queue up a task to cache them in redis so subsequent calls to the GQL endpoint for the same repo hit redis instead of GCS
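as a rough sketch, the read path described above could look something like this (every name and the GCS object path are stand-ins for illustration, not the PR's actual functions):

```python
from typing import Callable


def get_test_rollup(
    repoid: int,
    branch: str,
    interval_days: int,
    redis_get: Callable[[str], bytes | None],
    gcs_read: Callable[[str], bytes | None],
    enqueue_cache_task: Callable[[int, str, int], None],
) -> bytes | None:
    """Return the serialized rollup, preferring redis over GCS."""
    key = f"test_results:{repoid}:{branch}:{interval_days}"
    cached = redis_get(key)
    if cached is not None:
        return cached  # redis hit: cheapest path

    # the GCS object path here is made up for the sketch
    blob = gcs_read(f"test_results/rollups/{repoid}/{branch}/{interval_days}")
    if blob is not None:
        # redis miss but GCS hit: queue a task to warm redis for the next call
        enqueue_cache_task(repoid, branch, interval_days)
    return blob
```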
a4a6e79 to e3dd9a7
    flake_rate: float
    flake_rate_percent_change: float | None
    flake_count_percent_change: float | None = None
dang you didn't like having it grouped huh
all args with defaults must come after arguments with no defaults 😭
    flake_rate_percent_change: float | None = None
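for anyone skimming the thread, a toy illustration of that rule (the class name and field set here are stand-ins, not the PR's actual model):

```python
from dataclasses import dataclass


@dataclass
class FlakeAggregates:
    flake_count: int                                  # no default
    flake_rate: float                                 # no default
    flake_count_percent_change: float | None = None   # fields with defaults must come last
    flake_rate_percent_change: float | None = None
```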
def calculate_flake_aggregates(table: pl.DataFrame) -> pl.DataFrame:
is it possible to know what columns exist on the table at this point? Right now it's a little "magic"
the names of the columns are in the worker repo where it's being written. come to think of it, maybe all the serialization and reading stuff could live in shared, but i don't know if that would make things worse; we only read in api and we only write in worker
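until that happens, one way to make the columns less "magic" would be a small guard at read time — a hypothetical helper, not something in this PR; the column names are taken from the snippets in this review:

```python
import polars as pl

EXPECTED_COLUMNS = {"total_fail_count", "total_skip_count", "avg_duration", "flags"}


def assert_expected_columns(table: pl.DataFrame) -> None:
    # fail loudly if the frame written by the worker is missing columns we rely on
    missing = EXPECTED_COLUMNS - set(table.columns)
    if missing:
        raise ValueError(f"rollup frame is missing columns: {sorted(missing)}")
```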
),
(pl.col("total_skip_count").sum()).alias("skips"),
(pl.col("total_fail_count").sum()).alias("fails"),
((pl.col("avg_duration") >= pl.col("avg_duration").quantile(0.95)).sum()).alias(
do we still need to do the limit 100 here?
indeed, that's being handled above but not here
merged_results: pl.DataFrame = pl.concat([past_aggregates, curr_aggregates])

merged_results = merged_results.with_columns(
It was a bit confusing for me at first to see this, since I didn't realize with_columns is essentially an append or an upsert depending on whether the column already exists
i will leave a comment
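for reference, a minimal demonstration of that behavior (toy data, nothing from the PR):

```python
import polars as pl

df = pl.DataFrame({"fails": [1, 2], "skips": [0, 1]})

# same output name -> the existing column is replaced
df = df.with_columns((pl.col("fails") * 2).alias("fails"))

# new output name -> a column is appended
df = df.with_columns((pl.col("fails") + pl.col("skips")).alias("total"))
```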
    first: int | None,
    last: int | None,
) -> None:
    if interval not in {1, 7, 30}:
nit: do we have these magic numbers typed out as consts somewhere?
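for illustration, one possible shape for such constants (the member names are guesses, apart from INTERVAL_30_DAY which comes up later in this review):

```python
from enum import IntEnum


class MeasurementInterval(IntEnum):
    INTERVAL_1_DAY = 1
    INTERVAL_7_DAY = 7
    INTERVAL_30_DAY = 30


VALID_INTERVALS = {m.value for m in MeasurementInterval}
# the guard above could then read: if interval not in VALID_INTERVALS: ...
```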
).decode("ascii")

def validate(
Random thought here, does the order of these validations matter? Do we want it to matter?
nope, and i don't think it should matter (maybe in past iterations it would've)
if flags:
    table = table.filter(
        pl.col("flags").list.eval(pl.element().is_in(flags)).list.any()
Claude was telling me we might be able to add these guards to this check, not sure if you think we need em tho
pl.col("flags").is_not_null() & # handle null rows
pl.col("flags").list.len() > 0 & # handle empty lists
i think the is not null check makes sense
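a sketch of what the guarded filter could look like with just the null check added (same expressions as the diff, plus the guard):

```python
import polars as pl


def filter_by_flags(table: pl.DataFrame, flags: list[str]) -> pl.DataFrame:
    # null guard first, then the existing "any selected flag is present" check
    return table.filter(
        pl.col("flags").is_not_null()
        & pl.col("flags").list.eval(pl.element().is_in(flags)).list.any()
    )
```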
total_count = table.height

def ordering_expression(cursor_value: CursorValue, is_forward: bool) -> pl.Expr:
    if is_forward:
what does forward imply here? It's also a little confusing that we use the value of is_forward and then change the value of is_forward in the same fn
is_forward is just about determining the ordering of the table, but ordering_expression only reads is_forward; it's the outer function that sets the value and invokes ordering_expression, so i ended up just moving it out
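roughly the shape being described — ordering_expression only reads the flag, the caller decides it (the cursor type and ordering column are placeholders here, not the PR's):

```python
import polars as pl


def ordering_expression(cursor_value: str, is_forward: bool) -> pl.Expr:
    # reads is_forward but never reassigns it
    column = pl.col("name")
    return column > cursor_value if is_forward else column < cursor_value
```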
rows = [TestResultsRow(**row) for row in page_elements.rows(named=True)]

page: list[dict[str, str | TestResultsRow]] = [
this should be pages right? More than one
no, this is a single page, since it's after all the ordering and filtering
return await sync_to_async(generate_test_results_aggregates)(
    repoid=repository.repoid, interval=convert_interval_to_timedelta(interval)
    repoid=repository.repoid, interval=interval.value if interval else 30
these 30s can be MeasurementInterval.INTERVAL_30_DAY right?
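as a sketch, the fallback the comment is suggesting (re-declaring a minimal version of the hypothetical enum from earlier so the snippet stands alone):

```python
from enum import IntEnum


class MeasurementInterval(IntEnum):
    INTERVAL_30_DAY = 30


def resolve_interval(interval: MeasurementInterval | None) -> int:
    # replaces the bare 30 fallback with the named member
    return interval.value if interval else MeasurementInterval.INTERVAL_30_DAY.value
```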
    interval_start: int,
    interval_end: int | None = None,
) -> str:
    key = f"test_results:{repoid}:{branch}:{interval_start}"
Since this is an int... does this have a reasonable enough level of precision so we actually have some cache hits sometimes?
Like if the timestamp is 1231232132 vs. 1231232135 it seems like that would trigger a new rollup to be cached/created
there's no timestamp being used in the key, it's just the range of days that the cached values apply to, for example 1 day, 7 days, or 30 days
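concretely, a sketch of the key shape being described (the helper name and example values are made up; the PR calls the last segment interval_start):

```python
def test_results_cache_key(repoid: int, branch: str, interval_days: int) -> str:
    # interval_days is the rollup window (1, 7, or 30), not a unix timestamp,
    # so the key space stays small and repeat calls actually hit the cache
    return f"test_results:{repoid}:{branch}:{interval_days}"


# e.g. test_results_cache_key(123, "main", 30) -> "test_results:123:main:30"
```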
✅ All tests successful. No failed tests were found.
this commit refactors the code and tests related to ordering direction. if we map ordering direction asc = 1 and desc = 0, and first = 1 and last = 0, then the ordering direction is an xor of those two variables. the ordering direction determines whether we're filtering by greater than or less than. this commit also fixes a bug in the specification of slow tests
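a tiny sketch of that mapping (the constant names are illustrative; which xor result ends up meaning ">" versus "<" is decided in the PR itself):

```python
ASC, DESC = 1, 0
FIRST, LAST = 1, 0


def ordering_direction(order: int, pagination: int) -> int:
    return order ^ pagination


assert ordering_direction(ASC, FIRST) == ordering_direction(DESC, LAST) == 0
assert ordering_direction(ASC, LAST) == ordering_direction(DESC, FIRST) == 1
```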
Approved, reminder to update the bool(after) thing and add a comment on the top_k() thing