LSP doesn't handle diagnostic dynamic registration correctly #2561

captain0xff · 2024-11-17T21:43:59Z

I am using https://github.com/SofusA/roslyn-language-server (which basically wraps the microsoft roslyn lsp server since sublime lsp doesn't support named pipes yet) for C# but there are no diagnostics. The reason seems to be that sublime doesn't request correct textDocument/diagnostic even after they are registered. I am attaching both the VSCode traces and sublime lsp logs below for comparison. Comparing the logs it can be seen that even though the DocumentCompilerSemantic was registered along with some other options, sublime doesn't request for them whereas VSCode does and it is this request that provides the diagnostics.

sublime-logs.txt
trace.txt

jwortmann · 2024-11-17T23:03:31Z

relevant client/registerCapability from the logs formatted as JSON

{
    "registrations": [
        {
            "id": "ee01d1d1-9547-4581-ae65-c564e7fc48c7",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "workDoneProgress": true,
                "identifier": "HotReloadDiagnostics",
                "interFileDependencies": true,
                "workspaceDiagnostics": true
            }
        },
        {
            "id": "5a7194a3-66ed-4bd4-b781-0db332716c34",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "workDoneProgress": true,
                "identifier": "enc",
                "interFileDependencies": true,
                "workspaceDiagnostics": true
            }
        },
        {
            "id": "543a1e23-2e55-4aae-b548-8058cc83c1d3",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "DocumentAnalyzerSemantic",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "8c08d1a7-e45b-48e2-a371-59820d24ee65",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "XamlDiagnostics",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "4da15f15-03b2-4d07-a5f1-638b4596717f",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "DocumentAnalyzerSyntax",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "bf607efd-91c1-4999-bfb1-23aaf15b8a3d",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "NonLocal",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "82a65d1e-40f0-4086-9bad-3b36abd1d9fd",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "DocumentCompilerSemantic",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "629fb783-9d3d-492b-8f32-e37e3778a5ca",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "identifier": "syntax",
                "interFileDependencies": true,
                "workspaceDiagnostics": false
            }
        },
        {
            "id": "73744f2e-435b-490b-8f51-029bb1173248",
            "method": "textDocument/diagnostic",
            "registerOptions": {
                "workDoneProgress": true,
                "identifier": "WorkspaceDocumentsAndProject",
                "interFileDependencies": true,
                "workspaceDiagnostics": true
            }
        }
    ]
}

This client indeed doesn't support registering the same method (capability) multiple times, instead they just get overridden:

LSP/plugin/core/types.py

Lines 538 to 541 in 9be040d

    
           stored_registration_id = self.get(registration_path) 
        
           if isinstance(stored_registration_id, str): 
        
               msg = "{} is already registered at {} with ID {}, overwriting" 
        
               debug(msg.format(capability_path, registration_path, stored_registration_id))

But even if this gets fixed, currently only one pull diagnostics request would be sent. We would need to iterate over all provided DiagnosticOptions.identifier, send a request for each one of them, and then merge the responses. The purpose seems to be that certain diagnostic streams (ids) can be slower to compute than others: microsoft/language-server-protocol#1743 (comment)

Here it would need to iterate over all identifier, but currently only one is stored:

LSP/plugin/session_buffer.py

Lines 506 to 516 in 9be040d

    
           identifier = self.get_capability("diagnosticProvider.identifier") 
        
           if identifier: 
        
               params['identifier'] = identifier 
        
           result_id = self.session.diagnostics_result_ids.get(self._last_known_uri) 
        
           if result_id is not None: 
        
               params['previousResultId'] = result_id 
        
           request_id = self.session.send_request_async( 
        
               Request.documentDiagnostic(params, view), 
        
               partial(self._on_document_diagnostic_async, version), 
        
               partial(self._on_document_diagnostic_error_async, version) 
        
           )

jwortmann · 2024-11-18T09:04:43Z

I think we should split this into two separate issues.

The design for the dynamic registration implementation is flawed. Currently it assumes that a server can only register for each method (capability) once, but this is a false assumption. Internally this is stored in a DottedDict with the capability paths used as the key names, and therefore you can only get a single value of the registration options via get_capability methods. So in reality we need to use some other data structure, or store the options in an array per capability path. And then get_capability would probably return an array as well.

Diagnostics are currently stored per session in a DiagnosticStorage object, which is also a dict and uses the URIs as the key names. Whenever new diagnostics arrive, they override the old diagnostics entirely. This works fine for push diagnostics (publishDiagnostics notification). But with the pull diagnostics model, they need to be stored as individual streams instead, and the responses need to be distributed into these streams based on the identifier given in the request. Currently pull diagnostics just reuse the same method as push diagnostics:

LSP/plugin/session_buffer.py

Lines 528 to 529 in 9be040d

    
           self.session.m_textDocument_publishDiagnostics( 
        
               {'uri': self._last_known_uri, 'diagnostics': response['items']})

It looks like both of these issues will require significant rewrites.

captain0xff · 2024-11-18T17:19:56Z

Yeah, it appears that supporting this server in a proper way will take significant time and effort. So, ig for now I will just fork this client and will try to hack together something (this way I can avoid the need for a wrapper altogether). I try contributing upstream if I can get something working.
@jwortmann are you in the discord server?

jwortmann · 2024-11-19T18:59:37Z

I don't use discord. But feel free to open a PR or discuss possible solutions here if you have some ideas.

For point 2) an approach could be to pass the identifier as an additional argument in DiagnosticStorage.add_diagnostics_async and then use it as part of a tuple with the URI for the dictionary key. For the regular publishDiagnostics notification we could use an empty string as the default value for that argument. This would ensure that diagnostic streams are handled separately and without overriding each other. Then to retrieve all diagnostics, in DiagnosticStorage.diagnostics_by_document_uri it would need to collect the diagnostics from all of the streams. This would require some custom logic to iterate over all identifiers when using self.get with the URI.

But I think it would probably be better to start with implementing point 1), which would be required anyway when we want to test the implementation for 2).

jwortmann · 2024-11-25T13:12:52Z

Here is a dirty way how point 1) could work. I haven't tested, though.

diff --git a/plugin/core/types.py b/plugin/core/types.py
index b5d5b932..105b3c58 100644
--- a/plugin/core/types.py
+++ b/plugin/core/types.py
@@ -528,6 +528,13 @@ class Capabilities(DottedDict):
     (from Server -> Client).
     """
 
+    __slots__ = ('_d', '_registrations', '_registration_options')
+
+    def __init__(self, d: dict[str, Any] | None = None) -> None:
+        super().__init__(d)
+        self._registrations: dict[str, set[str]] = {}
+        self._registration_options: dict[str, Any] = {}
+
     def register(
         self,
         registration_id: str,
@@ -535,10 +542,8 @@ class Capabilities(DottedDict):
         registration_path: str,
         options: dict[str, Any]
     ) -> None:
-        stored_registration_id = self.get(registration_path)
-        if isinstance(stored_registration_id, str):
-            msg = "{} is already registered at {} with ID {}, overwriting"
-            debug(msg.format(capability_path, registration_path, stored_registration_id))
+        self._registrations.setdefault(capability_path, set()).add(registration_id)
+        self._registration_options[registration_id] = options
         self.set(capability_path, options)
         self.set(registration_path, registration_id)
 
@@ -548,13 +553,15 @@ class Capabilities(DottedDict):
         capability_path: str,
         registration_path: str
     ) -> dict[str, Any] | None:
+        self._registrations.get(capability_path, set()).discard(registration_id)
+        self._registration_options.pop(registration_id, None)
         stored_registration_id = self.get(registration_path)
         if not isinstance(stored_registration_id, str):
             debug("stored registration ID at", registration_path, "is not a string")
             return None
         elif stored_registration_id != registration_id:
-            msg = "stored registration ID ({}) is not the same as the provided registration ID ({})"
-            debug(msg.format(stored_registration_id, registration_id))
+            # msg = "stored registration ID ({}) is not the same as the provided registration ID ({})"
+            # debug(msg.format(stored_registration_id, registration_id))
             return None
         else:
             discarded = self.get(capability_path)
@@ -562,6 +569,9 @@ class Capabilities(DottedDict):
             self.remove(registration_path)
             return discarded
 
+    def get_all(self, path: str) -> list[Any]:
+        return [self._registration_options[registration_id] for registration_id in self._registrations.get(path, [])]
+
     def assign(self, d: ServerCapabilities) -> None:
         textsync = normalize_text_sync(d.pop("textDocumentSync", None))
         super().assign(cast(dict, d))
diff --git a/plugin/session_buffer.py b/plugin/session_buffer.py
index e0d4cb49..003ad962 100644
--- a/plugin/session_buffer.py
+++ b/plugin/session_buffer.py
@@ -271,6 +271,11 @@ class SessionBuffer:
         value = self.capabilities.get(capability_path)
         return value if value is not None else self.session.capabilities.get(capability_path)
 
+    def get_capability_2(self, capability_path: str) -> list[Any]:
+        if self.session.config.is_disabled_capability(capability_path):
+            return []
+        return self.capabilities.get_all(capability_path) + self.session.capabilities.get_all(capability_path)
+
     def has_capability(self, capability_path: str) -> bool:
         value = self.get_capability(capability_path)
         return value is not False and value is not None

This keeps a list of the registration IDs for each registered method. It's ugly because it still keeps the old DottedDict for the "single registration" capabilities, to maintain some form of backwards compatibility with the current get_capability method which expects only a single registration. Otherwise this would probably need to be rewritten completely from scratch. And I'm unsure if there is any good way to combine the DottedDict (allowing to query sub-paths of the capabilities) and having multiple registrations for the same method.
Here get_capability_2 (and get_all) would require the exact path used in the registration and doesn't accept sub-paths like diagnosticProvider.identifier. So for the diagnostics request it would need to do something like

_params: DocumentDiagnosticParams = {'textDocument': text_document_identifier(view)}
identifiers = set()
for registration in self.get_capability_2('diagnosticProvider'):
    identifiers.add(registration.get('identifier', ''))
for identifier in identifiers:
    params = _params.copy()
    if identifier:
        params['identifier'] = identifier
    # ... send request

This should also work if the diagnostics capability was registered without an identifier. After that the request id and pending requests need to be stored per identifier. And then the responses need to be distributed and stored into the respective diagnostic streams as mentioned in the previous comment.

captain0xff · 2024-11-26T20:04:53Z

I started to work on this but then got busy irl due to exams. Thank you for the patch. I will take a look once my exams are over.

jwortmann · 2024-11-27T08:47:55Z

I have something in the works / in my mind how the diagnostic streams could be implemented. So if you have a bit of patience I can probably push it on a testing branch when it's ready, and then it would be helpful if you could test it (I don't have any C# project at hand and iirc I have only used C# once for a few days and that was almost 10 years ago). The implementation for the diagnostic streams with dynamic registrations might be a bit tricky and I'm unsure whether my ideas would be the best approach, so if you or rchl or predragnikolic have some ideas too, feel free to leave some notes. Good luck for your exams!

captain0xff · 2024-11-29T23:02:16Z

Tried your patches and the good news is that all the diagnostic requests are sent correctly.
But the bad news is that there are still no diagnostics. Even though the server is sending the diagnostics (see the ; expected message). I am attaching the logs below.
sublime-logs.txt

captain0xff · 2024-11-29T23:05:26Z

Oh, it seems that the diagnostics is appearing but vanishing almost instantly for some reason.

captain0xff · 2024-11-29T23:09:08Z

It becomes more clear when I make another mistake in the same line. And I think I have an idea why this is happening but don't have enough understanding of the code base to solve it.

Screencast.From.2024-11-30.04-36-48.mp4

jwortmann · 2024-11-30T08:23:09Z

That is expected. Currently the diagnostic responses would just override each other.

You could try wether #2564 works. To be honest, the capability registration and the diagnostic handling should be rewritten from scratch in a clean way. DottedDict was nice to have, but it just doesn't work. But first someone needs to find a better design which can handle multiple registrations for the same capability method.

captain0xff changed the title ~~LSP doesn't handle dynamic registration correctly~~ LSP doesn't handle diagnostic dynamic registration correctly Nov 17, 2024

jwortmann added the bug label Nov 18, 2024

jwortmann linked a pull request Nov 30, 2024 that will close this issue

Fix dynamic registration and diagnostic streams #2564

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LSP doesn't handle diagnostic dynamic registration correctly #2561

LSP doesn't handle diagnostic dynamic registration correctly #2561

captain0xff commented Nov 17, 2024 •

edited

Loading

jwortmann commented Nov 17, 2024

jwortmann commented Nov 18, 2024

captain0xff commented Nov 18, 2024

jwortmann commented Nov 19, 2024 •

edited

Loading

jwortmann commented Nov 25, 2024

captain0xff commented Nov 26, 2024

jwortmann commented Nov 27, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading

jwortmann commented Nov 30, 2024

LSP doesn't handle diagnostic dynamic registration correctly #2561

LSP doesn't handle diagnostic dynamic registration correctly #2561

Comments

captain0xff commented Nov 17, 2024 • edited Loading

jwortmann commented Nov 17, 2024

jwortmann commented Nov 18, 2024

captain0xff commented Nov 18, 2024

jwortmann commented Nov 19, 2024 • edited Loading

jwortmann commented Nov 25, 2024

captain0xff commented Nov 26, 2024

jwortmann commented Nov 27, 2024 • edited Loading

captain0xff commented Nov 29, 2024 • edited Loading

captain0xff commented Nov 29, 2024 • edited Loading

captain0xff commented Nov 29, 2024 • edited Loading

jwortmann commented Nov 30, 2024

captain0xff commented Nov 17, 2024 •

edited

Loading

jwortmann commented Nov 19, 2024 •

edited

Loading

jwortmann commented Nov 27, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading

captain0xff commented Nov 29, 2024 •

edited

Loading