LSP: fix "call hierarchy" across files (chapel-lang#26040)

In the following program, trying to get the call hierarchy for `writeln` did not work on `main`. ```Chapel writeln("Hello, world!"); ``` There were a few issues with this. The first issue is that the the "index" that was fed through a `CallHierarchyItem` to LSP and back to `CLS` was an index into a single file's list of instantiations. Thus, if the function was instantiated in one place, but defined in another, the index would not be valid (or worse, be incorrect). This PR changes this by changing the way typed signatures are mapped to JSON-compatible identifiers; they are instead registered with a `ContextContainer` object, getting a unique ID. The unique ID is then used as JSON data, and can be used to access the same `TypedSignature` object when the JSON is being deserialized. This, however, was not enough. The next problem was that we currently create a new `ContextContainer` per file (to handle the case in which multiple files with the same name / module name exist). As a result of this, `writeln` that is called in module `A` and defined in module `B` will have its instantiation computed using `A`'s context, but its AST and other information will use `B`'s context. This is troublesome. This PR adjusts the situation by changing the `file -> FileInfo` mapping to be a `(file, context id) -> FileInfo` mapping. This way, the same file can have several contexts (e.g., if a standard library file is loaded from two distinct files, but both named `A.chpl`). By default (e.g., when opening a file), the `(file, "None")` key is used (since no particular context was requested). However, if a file A needs to show an instantiation (via file info) from file B, we create a new mapping `(fileB, fileA.context.id) -> fileInfoB`, which is created from the same context. This ensures that there is a file information object for file B with the same context as that of file A. When a call hierarchy item is constructed, it now specifies what context was used to create it. Thus, if other CallHierarchyItems point to other files, these other files are opened with the same context. To communicate the context over JSON, I employ the same approach as I did with typed signatures (register a context object with unique ID, use the ID to marshal the context to and from JSON). ~~To fix the case of `writeln` in particular, I had to relax an assertion from the `_owned`/`_shared_` resolution PR chapel-lang#25617. This led to some simplified code in any case, which is great. I believe @arezaii is fine with this change.~~ This was separately merged in chapel-lang#26078. A bug that I'm surprised we didn't find prior to this, which is fixed in this PR, is that vector unmarshalling for Python interop allocates a vector of the Python list's size, then pushes _additional elements_ instead of setting the already-allocated locations. This ended up e.g., pushing an empty string for each "real" string a user passed in a list. This led to some odd interactions with Dyno's module searching (an empty string as an input file led to "." being added to the module search path, which we do not want). Reviewed by @jabraham17 -- thanks! ## Testing - [x] LSP tests - [x] new call hierarchy tests - [x] dyno tests
lydia-duncan · Oct 14, 2024 · 55f5489 · 55f5489
2 parents 00b81be + 70ad280
commit 55f5489
Show file tree

Hide file tree

Showing 6 changed files with 344 additions and 30 deletions.
diff --git a/frontend/lib/resolution/resolution-queries.cpp b/frontend/lib/resolution/resolution-queries.cpp
@@ -2150,8 +2150,8 @@ ApplicabilityResult instantiateSignature(ResolutionContext* rc,
   const TypedFnSignature* parentSignature = sig->parentFn();
   if (parentSignature) {
     for (auto up = parentSignature; up; up = up->parentFn()) {
-      CHPL_ASSERT(!up->needsInstantiation());
       if (up->needsInstantiation()) {
+        CHPL_UNIMPL("parent function needs instantiation");
         return ApplicabilityResult::failure(sig->id(), FAIL_CANDIDATE_OTHER);
       }
     }

diff --git a/tools/chapel-py/src/python-types.h b/tools/chapel-py/src/python-types.h
@@ -92,7 +92,7 @@ template <typename T>
 std::vector<T> unwrapVector(ContextObject* CONTEXT, PyObject* vec) {
   std::vector<T> toReturn(PyList_Size(vec));
   for (ssize_t i = 0; i < PyList_Size(vec); i++) {
-    toReturn.push_back(PythonReturnTypeInfo<T>::unwrap(CONTEXT, PyList_GetItem(vec, i)));
+    toReturn[i] = PythonReturnTypeInfo<T>::unwrap(CONTEXT, PyList_GetItem(vec, i));
   }
   return toReturn;
 }

diff --git a/tools/chpl-language-server/src/chpl-language-server.py b/tools/chpl-language-server/src/chpl-language-server.py
@@ -499,6 +499,8 @@ def __init__(self, file: str, config: Optional["WorkspaceConfig"]):
         self.context: chapel.Context = chapel.Context()
         self.file_infos: List["FileInfo"] = []
         self.global_uses: Dict[str, List[References]] = defaultdict(list)
+        self.instantiation_ids: Dict[chapel.TypedSignature, str] = {}
+        self.instantiation_id_counter = 0
 
         if config:
             file_config = config.for_file(file)
@@ -508,6 +510,27 @@ def __init__(self, file: str, config: Optional["WorkspaceConfig"]):
 
         self.context.set_module_paths(self.module_paths, self.file_paths)
 
+    def register_signature(self, sig: chapel.TypedSignature) -> str:
+        """
+        The language server can't send over typed signatures directly for
+        situations such as call hierarchy items (but we need to reason about
+        instantiations). Instead, keep a global unique ID for each signature,
+        and use that to identify them.
+        """
+        if sig in self.instantiation_ids:
+            return self.instantiation_ids[sig]
+
+        self.instantiation_id_counter += 1
+        uid = str(self.instantiation_id_counter)
+        self.instantiation_ids[sig] = uid
+        return uid
+
+    def retrieve_signature(self, uid: str) -> Optional[chapel.TypedSignature]:
+        for sig, sig_uid in self.instantiation_ids.items():
+            if sig_uid == uid:
+                return sig
+        return None
+
     def new_file_info(
         self, uri: str, use_resolver: bool
     ) -> Tuple["FileInfo", List[Any]]:
@@ -1007,7 +1030,9 @@ def __init__(self, config: CLSConfig):
         super().__init__("chpl-language-server", "v0.1")
 
         self.contexts: Dict[str, ContextContainer] = {}
-        self.file_infos: Dict[str, FileInfo] = {}
+        self.context_ids: Dict[ContextContainer, str] = {}
+        self.context_id_counter = 0
+        self.file_infos: Dict[Tuple[str, Optional[str]], FileInfo] = {}
         self.configurations: Dict[str, WorkspaceConfig] = {}
 
         self.use_resolver: bool = config.get("resolver")
@@ -1107,17 +1132,28 @@ def get_context(self, uri: str) -> ContextContainer:
         for file in context.file_paths:
             self.contexts[file] = context
         self.contexts[path] = context
+        self.context_id_counter += 1
+        self.context_ids[context] = str(self.context_id_counter)
 
         return context
 
+    def retrieve_context(self, context_id: str) -> Optional[ContextContainer]:
+        for ctx, cid in self.context_ids.items():
+            if cid == context_id:
+                return ctx
+        return None
+
     def eagerly_process_all_files(self, context: ContextContainer):
         cfg = context.config
         if cfg:
             for file in cfg.files:
                 self.get_file_info("file://" + file, do_update=False)
 
     def get_file_info(
-        self, uri: str, do_update: bool = False
+        self,
+        uri: str,
+        do_update: bool = False,
+        context_id: Optional[str] = None,
     ) -> Tuple[FileInfo, List[Any]]:
         """
         The language server maintains a FileInfo object per file. The FileInfo
@@ -1128,19 +1164,34 @@ def get_file_info(
         creating one if it doesn't exist. If do_update is set to True,
         then the FileInfo's index is rebuilt even if it has already been
         computed. This is useful if the underlying file has changed.
+
+        Most of the time, we will create a new context for a given URI. When
+        requested, however, context_id will be used to create a FileInfo
+        for a specific context. This is useful if e.g., file A wants to display
+        an instantiation in file B.
         """
 
         errors = []
 
-        if uri in self.file_infos:
-            file_info = self.file_infos[uri]
+        fi_key = (uri, context_id)
+        if fi_key in self.file_infos:
+            file_info = self.file_infos[fi_key]
             if do_update:
                 errors = file_info.context.advance()
         else:
-            file_info, errors = self.get_context(uri).new_file_info(
-                uri, self.use_resolver
-            )
-            self.file_infos[uri] = file_info
+            if context_id:
+                context = self.retrieve_context(context_id)
+                assert context
+            else:
+                context = self.get_context(uri)
+
+            file_info, errors = context.new_file_info(uri, self.use_resolver)
+            self.file_infos[fi_key] = file_info
+
+            # Also make this the "default" context for this file in case we
+            # open it.
+            if (uri, None) not in self.file_infos:
+                self.file_infos[(uri, None)] = file_info
 
         # filter out errors that are not related to the file
         cur_path = uri[len("file://") :]
@@ -1396,7 +1447,8 @@ def sym_to_call_hierarchy_item(
         """
         loc = location_to_location(sym.location())
 
-        inst_idx = -1
+        inst_id = None
+        context_id = None
 
         return CallHierarchyItem(
             name=sym.name(),
@@ -1405,11 +1457,11 @@ def sym_to_call_hierarchy_item(
             uri=loc.uri,
             range=loc.range,
             selection_range=location_to_range(sym.name_location()),
-            data=[sym.unique_id(), inst_idx],
+            data=[sym.unique_id(), inst_id, context_id],
         )
 
     def fn_to_call_hierarchy_item(
-        self, sig: chapel.TypedSignature
+        self, sig: chapel.TypedSignature, caller_context: ContextContainer
     ) -> CallHierarchyItem:
         """
         Like sym_to_call_hierarchy_item, but for function instantiations.
@@ -1419,8 +1471,8 @@ def fn_to_call_hierarchy_item(
         """
         fn: chapel.Function = sig.ast()
         item = self.sym_to_call_hierarchy_item(fn)
-        fi, _ = self.get_file_info(item.uri)
-        item.data[1] = fi.index_of_instantiation(fn, sig)
+        item.data[1] = caller_context.register_signature(sig)
+        item.data[2] = self.context_ids[caller_context]
 
         return item
 
@@ -1433,16 +1485,17 @@ def unpack_call_hierarchy_item(
             item.data is None
             or not isinstance(item.data, list)
             or not isinstance(item.data[0], str)
-            or not isinstance(item.data[1], int)
+            or not isinstance(item.data[1], Optional[str])
+            or not isinstance(item.data[2], Optional[str])
         ):
             self.show_message(
                 "Call hierarchy item contains missing or invalid additional data",
                 MessageType.Error,
             )
             return None
-        uid, idx = item.data
+        uid, inst_id, ctx = item.data
 
-        fi, _ = self.get_file_info(item.uri)
+        fi, _ = self.get_file_info(item.uri, context_id=ctx)
 
         # TODO: Performance:
         # Once the Python bindings supports it, we can use the
@@ -1456,11 +1509,7 @@ def unpack_call_hierarchy_item(
             # We don't handle that here.
             return None
 
-        instantiation = None
-        if idx != -1:
-            instantiation = fi.instantiation_at_index(fn, idx)
-        else:
-            instantiation = fi.concrete_instantiation_for(fn)
+        instantiation = fi.context.retrieve_signature(inst_id)
 
         return (fi, fn, instantiation)
 
@@ -2000,7 +2049,10 @@ async def prepare_call_hierarchy(
 
         # Oddly, returning multiple here makes for no child nodes in the VSCode
         # UI. Just take one signature for now.
-        return next(([ls.fn_to_call_hierarchy_item(sig)] for sig in sigs), [])
+        return next(
+            ([ls.fn_to_call_hierarchy_item(sig, fi.context)] for sig in sigs),
+            [],
+        )
 
     @server.feature(CALL_HIERARCHY_INCOMING_CALLS)
     async def call_hierarchy_incoming(
@@ -2046,7 +2098,7 @@ async def call_hierarchy_incoming(
             if isinstance(called_fn, str):
                 item = ls.sym_to_call_hierarchy_item(hack_id_to_node[called_fn])
             else:
-                item = ls.fn_to_call_hierarchy_item(called_fn)
+                item = ls.fn_to_call_hierarchy_item(called_fn, fi.context)
 
             to_return.append(
                 CallHierarchyIncomingCall(
@@ -2070,7 +2122,7 @@ async def call_hierarchy_outgoing(
         if unpacked is None:
             return None
 
-        _, fn, instantiation = unpacked
+        fi, fn, instantiation = unpacked
 
         outgoing_calls: Dict[chapel.TypedSignature, List[chapel.FnCall]] = (
             defaultdict(list)
@@ -2093,7 +2145,7 @@ async def call_hierarchy_outgoing(
 
         to_return = []
         for called_fn, calls in outgoing_calls.items():
-            item = ls.fn_to_call_hierarchy_item(called_fn)
+            item = ls.fn_to_call_hierarchy_item(called_fn, fi.context)
             to_return.append(
                 CallHierarchyOutgoingCall(
                     item,

diff --git a/tools/chpl-language-server/test/basic.py b/tools/chpl-language-server/test/basic.py
@@ -140,6 +140,40 @@ async def test_go_to_definition_use_standard(client: LanguageClient):
         await check_goto_decl_def_module(client, doc, pos((2, 8)), mod_Time)
 
 
+@pytest.mark.asyncio
+async def test_go_to_definition_use_across_modules(client: LanguageClient):
+    """
+    Ensure that go-to-definition works on symbols that reference other modules
+    """
+
+    fileA = """
+            module A {
+              var x = 42;
+            }
+            """
+    fileB = """
+            module B {
+              use A;
+              var y = x;
+            }
+            """
+
+    async def check(docs):
+        docA = docs("A")
+        docB = docs("B")
+
+        await check_goto_decl_def_module(client, docB, pos((1, 6)), docA)
+        await check_goto_decl_def(
+            client, docB, pos((2, 10)), (docA, pos((1, 6)))
+        )
+
+    async with source_files(client, A=fileA, B=fileB) as docs:
+        await check(docs)
+
+    async with unrelated_source_files(client, A=fileA, B=fileB) as docs:
+        await check(docs)
+
+
 @pytest.mark.asyncio
 async def test_go_to_definition_standard_rename(client: LanguageClient):
     """