Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: go/scanner: stop adding source file directory to relative file name in //line comment #69689

Open
podtserkovskiy opened this issue Sep 28, 2024 · 32 comments
Labels
Milestone

Comments

@podtserkovskiy
Copy link
Contributor

podtserkovskiy commented Sep 28, 2024

Proposal Details

The current behaviour

The go/scanner.Scanner.updateLineInfo prepends name of *.cgo1.go file directory to the relative path of the original CGo file taken from //line comment.

For example:

$ go tool cgo -objdir cgo_objdir -trimpath $(pwd) src/test1.go 
# Generates file `cgo_objdir/test1.cgo1.go`
# The file have line: `//line src/test1.go:1:1`

Let's parse it then with parser.ParseFile(fset, "cgo_objdir/test1.cgo1.go", nil, parser.ParseComments)and check fset value.

The issue is that fset will contain go/token.File with infos[0].lineInfo.Filename = "cgo_objdir/src/test1.go".

The path "cgo_objdir/src/test1.go" in lineInfo.Filename is invalid.

There was an attempt to fix this behaviour, but it was rollbacked in CL 127658 due to Issue 26671

When this becomes a problem?

In build systems like Buck2 or Bazel the build actions may be executed on different hosts, so we can't bake absolute paths inside build artifacts.

We have a way to prevent this by adding -trimpath $(pwd) to build commands.
However the go/scanner will compute invalid filepath in this case as shown in the above.

A specific example

The cgocall analyser in golang.org/x/tools is trying to access the original source file by fset.Position(raw.Pos()).Filename, but fails when relative filepaths used because the filename is invalid.

This doesn't allow seamless integration of Buck2/Bazel with code analysis tools.

Possible solutions

Option 1: Bold

Rollback CL 127658, but this probably can break some existing linters or other programs like it was in Issue 26671.

Option 2: Safe

Rollback CL 127658 and add a GODEBUG constant to minimise potential damage to existing programs.

  • goscannerprefixlinedir = "1" - old behaviour
  • goscannerprefixlinedir != "1" - new behaviour

I'm not sure what the default behaviour of should be for go1.24.
I'm happy with any option as we can set GODEBUG env-var before running code analysers.

EDIT: updated GODEBUG variable name

@gopherbot gopherbot added this to the Proposal milestone Sep 28, 2024
@gabyhelp
Copy link

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Sep 29, 2024
@ianlancetaylor
Copy link
Contributor

CC @griesemer @findleyr

@podtserkovskiy
Copy link
Contributor Author

I've created a test project podtserkovskiy/issue-69689 with go and cgo testcases for popular linters to make sure linters work for the same before and after my change.
I've reverted CL 127658 on Go release-branch.go1.23 and ran popular linters golangci_lint (with all lints enabled including unparam, unconvert and unused) and staticcheck on podtserkovskiy/issue-69689.

I compared results against the linters compiled with non-modified Go. I downloaded the linters from Homebrew repository.

golangci_lint before my changes

issue-69689 git:(main) golangci-lint run --enable-all
WARN The linter 'execinquery' is deprecated (since v1.58.0) due to: The repository of the linter has been archived by the owner.  
WARN The linter 'exportloopref' is deprecated (since v1.60.2) due to: Since Go1.22 (loopvar) this linter is no longer relevant. Replaced by copyloopvar. 
WARN The linter 'gomnd' is deprecated (since v1.58.0) due to: The linter has been renamed. Replaced by mnd. 
rand/rand.go:11:18: unused-parameter: parameter 'j' seems to be unused, consider removing or renaming it as _ (revive)
func Seed(i int, j int) { // unparam
                 ^
rand/rand.go:12:9: unnecessary conversion (unconvert)
        i = int(i) // unconvert
               ^
rand/rand.go:9:6: func `unusedFunc` is unused (unused)
func unusedFunc() {} // unused func
     ^
rand/rand.go:14:13: SA1004: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (staticcheck)
        time.Sleep(1)
                   ^
rand_cgo/rand_cgo.go:15:18: unused-parameter: parameter 'j' seems to be unused, consider removing or renaming it as _ (revive)
func Seed(i int, j int) { // unparam
                 ^
rand_cgo/rand_cgo.go:16:9: unnecessary conversion (unconvert)
        i = int(i) // unconvert
               ^
rand_cgo/rand_cgo.go:18:13: SA1004: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (staticcheck)
        time.Sleep(1) // SA1004
                   ^

golangci_lint after my changes

➜  issue-69689 git:(main) GOROOT=$PWD/../go PATH=$PWD/../go/bin:$PATH ./../golangci-lint/golangci-lint run --enable-all
WARN The linter 'exportloopref' is deprecated (since v1.60.2) due to: Since Go1.22 (loopvar) this linter is no longer relevant. Replaced by copyloopvar. 
WARN The linter 'gomnd' is deprecated (since v1.58.0) due to: The linter has been renamed. Replaced by mnd. 
WARN The linter 'execinquery' is deprecated (since v1.58.0) due to: The repository of the linter has been archived by the owner.  
rand/rand.go:11:18: unused-parameter: parameter 'j' seems to be unused, consider removing or renaming it as _ (revive)
func Seed(i int, j int) { // unparam
                 ^
rand/rand.go:12:9: unnecessary conversion (unconvert)
        i = int(i) // unconvert
               ^
rand/rand.go:9:6: func `unusedFunc` is unused (unused)
func unusedFunc() {} // unused func
     ^
rand/rand.go:14:13: SA1004: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (staticcheck)
        time.Sleep(1)
                   ^
rand_cgo/rand_cgo.go:15:18: unused-parameter: parameter 'j' seems to be unused, consider removing or renaming it as _ (revive)
func Seed(i int, j int) { // unparam
                 ^
rand_cgo/rand_cgo.go:16:9: unnecessary conversion (unconvert)
        i = int(i) // unconvert
               ^
rand_cgo/rand_cgo.go:18:13: SA1004: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (staticcheck)
        time.Sleep(1) // SA1004
                   ^

staticcheck before my changes

➜  issue-69689 git:(main) staticcheck ./...
rand/rand.go:9:6: func unusedFunc is unused (U1000)
rand/rand.go:14:13: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (SA1004)
rand_cgo/rand_cgo.go:18:13: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (SA1004)

staticcheck after my changes

➜  issue-69689 git:(main) GOROOT=$PWD/../go PATH=$PWD/../go/bin:$PATH /tmp/staticcheck ./...
rand/rand.go:9:6: func unusedFunc is unused (U1000)
rand/rand.go:14:13: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (SA1004)
rand_cgo/rand_cgo.go:18:13: sleeping for 1 nanoseconds is probably a bug; be explicit if it isn't (SA1004)

Conclusion

As we can see from these examples #26671 doesn't reproduce on the linters anymore if we rollback CL 127658.
Therefore, it should be reasonable to change go/scanner.Scanner.updateLineInfo behaviour by removing CL 127658

However, there's a chance we may break some unknown to me existing programmes. I think it's reasonable to add a GODEBUG option to enable existing programs temporary opt-out from the go/scanner behaviour change.

Since the blast-radius shouldn't be large I think we can remove this GODEBUG option in go1.26 as likely nobody would need it by this time.

I'll make and publish a CL shortly.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/618276 mentions this issue: go/scanner: Preserve relative filenames from //line comments.

@podtserkovskiy
Copy link
Contributor Author

@griesemer @findleyr @ianlancetaylor Could you please take a look on the CL 618276?

@podtserkovskiy
Copy link
Contributor Author

Could you please take a look on this proposal, if would be very grateful if we can include this fix in go1.24 release.
I'm sorry for pinging you, but there's no much time left before the freeze.
@griesemer @findleyr @ianlancetaylor @prattmic @mknyszek @timothy-king @adonovan @mdempsky

@findleyr
Copy link
Contributor

findleyr commented Nov 4, 2024

Sorry you haven't gotten much traction on this issue. I suspect it's because there's no easy solution, and this has been a problem for a long time. It's likely that we won't reach a satisfactory resolution for 1.24.

I wasn't familiar with the context of this issue, but it strikes me that go/scanner should really not be in the business of transforming relative paths to absolute paths. I agree it would be more correct to change that behavior, though we may not be able to do so.

Can you clarify exactly how you are encountering the absolute paths? In which artifacts are they written? You say "we can't bake absolute paths inside build artifacts", but as far as I can tell from the issue, the paths are only modified inside the token.Files in memory of a process using go/scanner. I suspect that you're talking about compiler output, but it's worth being precise here. It may be that we can alter the behavior of go/scanner for the compiler, and leave it unmodified inside go/parser.ParseFile, which is how most tooling would consume its output.

@adonovan
Copy link
Member

adonovan commented Nov 4, 2024

I agree that go/scanner should not be transforming paths, and that a godebug seems like a reasonable way to make this sound but potentially risky change (though each godebug parameter at least doubles the configuration space for testing).

I suspect that you're talking about compiler output, but it's worth being precise here. It may be that we can alter the behavior of go/scanner for the compiler, and leave it unmodified inside go/parser.ParseFile, which is how most tooling would consume its output.

I'm confused; the compiler doesn't use go/scanner.

@findleyr
Copy link
Contributor

findleyr commented Nov 4, 2024

I'm confused; the compiler doesn't use go/scanner.

Indeed you're right. I thought that there was a use case somewhere.

The question remains: which artifacts hold the absolute paths?

@adonovan
Copy link
Member

adonovan commented Nov 4, 2024

which artifacts hold the absolute paths?

The source file contains //line relative/file.go:1 and the existing hack causes the token.File to contain "/cwd/relative/file/go", implicitly incorporating the process's working directory. Of course you only see the modified file names if you use FileSet.Position(pos), whereas gopls tries its hardest to use only FileSet.PositionFor(pos, false), bypassing the //line corrections.

@findleyr
Copy link
Contributor

findleyr commented Nov 4, 2024

@adonovan I understand that this is a problem for token.File (above, I said "as far as I can tell from the issue, the paths are only modified inside the token.Files in memory of a process using go/scanner"). What I don't understand is the exact mechanism that causes a problem in Bazel. The original report says In "build systems like Buck2 or Bazel the build actions may be executed on different hosts, so we can't bake absolute paths inside build artifacts". What are the artifacts? The cgo files themselves contain relative paths.

@adonovan
Copy link
Member

adonovan commented Nov 4, 2024

The original report says In "build systems like Buck2 or Bazel the build actions may be executed on different hosts, so we can't bake absolute paths inside build artifacts". What are the artifacts? The cgo files themselves contain relative paths.

Oh, I took that to mean that there are some go/token-based tools that are used during a Bazel build that read source files (including those generated by cgo) and include the file/line/column information in their output, causing them to inadvertently depend on the process's working directory, breaking determinism. (Technically, hermetism: such programs are deterministic on any given machine, but incorporate information about the build worker machine in their output.) This was a pretty common problem with Blaze and Forge until we tracked down and fixed every nondeterministic and non-hermetic program (by running them twice and failing if the results were not consistent).

@findleyr
Copy link
Contributor

findleyr commented Nov 4, 2024

Oh, I took that to mean that there are some go/token-based tools that are used during a Bazel build that read source files

Sure, but I wonder what those tools are, and whether they can be updated to write relative paths. Perhaps they can't, if they have some deep integration with go/parser (and go/parser doesn't expose the way it calls go/scanner), but it's worth understanding.

It is indeed inconvenient and confusing that go/scanner translates paths in this way, but the bug is arguably in the tool that generates the artifacts with absolute paths, or the system that consumes those absolute paths in an inconsistent context. From go/scanner's perspective, inside a running process, the transformation is valid.

Before advocating for a solution, I want to better understand the problem.

@adonovan
Copy link
Member

adonovan commented Nov 4, 2024

Sure, but I wonder what those tools are, and whether they can be updated to write relative paths. Perhaps they can't...

I'm sure they can, and I think this proposal is suggesting that they should so that we can remove this workaround from the scanner. The GODEBUG is an acknowledgement that there is risk and cost.

It is indeed inconvenient and confusing that go/scanner translates paths in this way, but the bug is arguably in the tool that generates the artifacts with absolute paths, or the system that consumes those absolute paths in an inconsistent context.

I think we are all in agreement on this.

From go/scanner's perspective, inside a running process, the transformation is valid.

I don't think that's true: given that the parser API takes as parameters both the content of the file and its apparent file name, the scanner has no justification for interacting with the file system.

@findleyr
Copy link
Contributor

findleyr commented Nov 4, 2024

I'm sure they can, and I think this proposal is suggesting that they should so that we can remove this workaround from the scanner.

Sorry, I don't understand this comment. I believe the proposal is suggesting that tools relying on absolute paths should update so that the path translation in go/scanner can be removed. Specifically, consider this quote: "The cgocall analyser in golang.org/x/tools is trying to access the original source file by fset.Position(raw.Pos()).Filename, but fails when relative filepaths used because the filename is invalid." That is saying the cgocall analyzer fails when relative paths are used in the token.File. It is not saying that the cgocall analyser is failing in the distributed bazel build. I'm asking what is failing in the distributed bazel build.

I don't think that's true: given that the parser API takes as parameters both the content of the file and its apparent file name, the scanner has no justification for interacting with the file system.

How is the scanner interacting with the file system? IIUC it is simply translating paths from relative to absolute, based on the directory of the filepath being parsed. There is no I/O involved.

@adonovan
Copy link
Member

adonovan commented Nov 4, 2024

Sorry, I now realize that all this time, thanks to this comment:

	// Put a relative filename in the current directory.

I was under the misapprehension that the scanner was using the process's current directory to absolutize relative file names in line directives, when in fact it is lexically joining the parent directory of the file containing the line directive (or the purported filename provided in the call to ParseFile).

The compiler documentation doesn't give any guidance on the interpretation of relative file names in line directives, and one can imagine a number of possibilities. Given that the scanner behavior should not depend on os.Getwd, and that the working directory of the tool that wrote the //line directive is unknowable, the only interpretation that makes sense to me is the one that is implemented (though possibly without the filepath.Clean, which can give the wrong answer in the presence of symbolic links). Unless the line directives are lexically relativized to the source file's nominal directory, the apparent file name reported by token.File.Name would be unrelativized, making it implicitly relative to the process's working directory.

(I had been nodding my head in agreement with all your comments all this time not realizing I was saying something opposite to what I meant.)

@timothy-king
Copy link
Contributor

  1. If a GODEBUG is introduced 1.24, the earliest it can be removed is 1.28. The minimum number of versions before deprecation is 4 (https://go.dev/doc/godebug).
  2. Given the conclusions in the comment proposal: go/scanner: stop adding source file directory to relative file name in //line comment  #69689 (comment) I am not really sure a GODEBUG is warranted, but the existence of go/scanner: file's dir lost in go1.11 #26671 is an okay case for introducing a GODEBUG.
  3. If we suspect this will break tools like the 1.11 attempt did (go/scanner: file's dir lost in go1.11 #26671), I suspect we want to introduce the GODEBUG as defaulting to its old behavior for one release. This lets users that want to switch away do this by taking active steps now, and gives some more runaway to tool authors before changing the default on them.
  4. Before we commit to the strategy of introducing and deprecating a GODEBUG in 4 releases, we should understand what we expect users to do before the deprecation. Accidentally introducing a GODEBUG forever would be an unfortunate outcome.

(Apologies if this is obvious to others. I'm catching up.)

@podtserkovskiy
Copy link
Contributor Author

@findleyr

Can you clarify exactly how you are encountering the absolute paths? In which artifacts are they written? You say "we can't bake absolute paths inside build artifacts", but as far as I can tell from the issue, the paths are only modified inside the token.Files in memory of a process using go/scanner. I suspect that you're talking about compiler output, but it's worth being precise here. It may be that we can alter the behavior of go/scanner for the compiler, and leave it unmodified inside go/parser.ParseFile, which is how most tooling would consume its output.

The cmd/cgo makes paths absolute, but this behaviour can be altered by-timpath=$(pwd) and Bazel does that.

The compiler works fine with both relative and absolute paths, that's not a problem at all. The compiler doesn't use go/scanner

The problem is tools like cgocall from golang.org/x/tools as they depend on go/scanner and the following code fset.Position(raw.Pos()).Filename doesn't work correct due to the behaviour of go/scanner I'd like to fix.

@podtserkovskiy
Copy link
Contributor Author

@adonovan

The compiler documentation doesn't give any guidance on the interpretation of relative file names in line directives, and one can imagine a number of possibilities. Given that the scanner behavior should not depend on os.Getwd, and that the working directory of the tool that wrote the //line directive is unknowable, the only interpretation that makes sense to me is the one that is implemented (though possibly without the filepath.Clean, which can give the wrong answer in the presence of symbolic links). Unless the line directives are lexically relativized to the source file's nominal directory, the apparent file name reported by token.File.Name would be unrelativized, making it implicitly relative to the process's working directory.

Yes go/scanner doesn't use os.Getwd, it just assumes that when we parse file abc/src1.go the relative path provided by //line comment is relative to the directory abc. As I understand from this code the compiler doesn't do this assumption.

@podtserkovskiy
Copy link
Contributor Author

@timothy-king

If a GODEBUG is introduced 1.24, the earliest it can be removed is 1.28. The minimum number of versions before deprecation is 4

Thanks, I didn't know about that.

I suspect we want to introduce the GODEBUG as defaulting to its old behavior for one release.

Yes, this makes sense to me

@podtserkovskiy
Copy link
Contributor Author

I understand this is quite a tricky issue, than you all for digging into it

@findleyr
Copy link
Contributor

findleyr commented Nov 5, 2024

Aha, this is interesting, @podtserkovskiy.

The problem is tools like cgocall from golang.org/x/tools as they depend on go/scanner and the following code fset.Position(raw.Pos()).Filename doesn't work correct due to the behaviour of go/scanner I'd like to fix.

Can you say more about this? If the line directives of the cgo files are relative, why is there an error when they are translated to absolute paths? IIUC, this translation happens inside of a call to go/parser.ParseFile, running on the local machine. Unless I'm missing something, the only way that the resulting absolute paths would be inaccurate is if the relative path to the referenced file has changed, or if the path passed to ParseFile is inaccurate (such as could be possible if the raw source were provided to ParseFile, bypassing an actual filesystem read).

@adonovan
Copy link
Member

adonovan commented Nov 5, 2024

Is it possible that the issue here is that Bazel uses a build sandbox, so when analyzing a file containing a //line directive, the directive refers to a file that is no longer accessible because it is not a direct input to the current build step? If so, then I think the real problem is cgocall's questionable assumption that it can open the file referred to by //line.

I would suggest using FileSet.PositionFor(pos, false) to ignore the //line directive, but the cgocall logic added in https://go.dev/cl/147317 is quite explicit about the need to parse the "raw" cgo file:

This change causes cgocall to parse, modify, type-check and analyze "raw"
cgo files. The approach (based on dot-importing the "cooked" package)
is rather too clever but should be more robust than the one it replaces.

Rather too clever indeed. Perhaps the solution is for cgocall to fail gracefully if it can't open the raw cgo file.

@podtserkovskiy
Copy link
Contributor Author

podtserkovskiy commented Nov 5, 2024

@findleyr

Can you say more about this? If the line directives of the cgo files are relative, why is there an error when they are translated to absolute paths?

The issue is that the relative path form //line comment is being modified here and the result of this modification is another relative (incorrect) filename that doesn't exists on the disk.

I've just published a repo with a repro on https://github.com/podtserkovskiy/issue-69689
The Readme.md contains both: the bug repro on go1.23 and the demonstration that https://go.dev/cl/618276 fixes the issue

@adonovan
Copy link
Member

adonovan commented Nov 5, 2024

Perhaps the solution is for cgocall to fail gracefully if it can't open the raw cgo file.

@podtserkovskiy, could you try applying this patch to see if it is a satisfactory solution?

diff --git a/go/analysis/passes/cgocall/cgocall.go b/go/analysis/passes/cgocall/cgocall.go
index 613583a1a..3b01cb84f 100644
--- a/go/analysis/passes/cgocall/cgocall.go
+++ b/go/analysis/passes/cgocall/cgocall.go
@@ -7,12 +7,14 @@
 package cgocall
 
 import (
+       "errors"
        "fmt"
        "go/ast"
        "go/format"
        "go/parser"
        "go/token"
        "go/types"
+       "io/fs"
        "log"
        "os"
        "strconv"
@@ -47,6 +49,9 @@ func run(pass *analysis.Pass) (interface{}, error) {
 
        cgofiles, info, err := typeCheckCgoSourceFiles(pass.Fset, pass.Pkg, pass.Files, pass.TypesInfo, pass.TypesSizes)
        if err != nil {
+               if err == errNoSource {
+                       return nil, nil
+               }
                return nil, err
        }
        for _, f := range cgofiles {
@@ -55,6 +60,8 @@ func run(pass *analysis.Pass) (interface{}, error) {
        return nil, nil
 }
 
+var errNoSource = errors.New("cgo source unavailable")
+
 func checkCgo(fset *token.FileSet, f *ast.File, info *types.Info, reportf func(token.Pos, string, ...interface{})) {
        ast.Inspect(f, func(n ast.Node) bool {
                call, ok := n.(*ast.CallExpr)
@@ -179,9 +186,18 @@ func typeCheckCgoSourceFiles(fset *token.FileSet, pkg *types.Package, files []*a
        for _, raw := range files {
                // If f is a cgo-generated file, Position reports
                // the original file, honoring //line directives.
+               //
+               // This operation is of dubious validity because it accesses
+               // files that may not be accessible when the analysis driver
+               // runs in a build sandbox as used by Bazel.
+               // So, if we can't read the source file, silently give up.
                filename := fset.Position(raw.Pos()).Filename // sic: Pos, not FileStart
-               f, err := parser.ParseFile(fset, filename, nil, parser.SkipObjectResolution)
+               f, err := parser.ParseFile(fset, filename+"bonkers", nil, parser.SkipObjectResolution)
                if err != nil {
+                       if errors.Is(err, new(fs.PathError)) {
+                               log.Printf("cgocall analyzer on package %s cannot read raw cgo source file %s (possible build sandbox?); skipping", pkg, filename)
+                               return nil, nil, errNoSource
+                       }
                        return nil, nil, fmt.Errorf("can't parse raw cgo file: %v", err)
                }
                found := false

@podtserkovskiy
Copy link
Contributor Author

could you try applying this patch to see if it is a satisfactory solution?

@adonovan This helps to silence the errors on this specific analyser, but this is not what I'd like to achieve.
I want to enable linters to work on CGo code on Go projects using Buck, specifically on Facebook codebase.

The problem is a bit more broad then cgocall analyser. Currently code issues like fmt.Sprintf("%d", "awd"), redundant returns or unhandled errors are ignored by popular linters on our codebase.

The root cause is caused by the assumption of go/scanner that prefixes any data from //line comment with the directory name passed into parser.ParseFile.
For example:

  • We have cgo_objdir/test1.cgo1.go with the comment //line src/test1.go:1:1
  • Then trying to parse it with parser.ParseFile(fset, "cgo_objdir/test1.cgo1.go", nil, parser.ParseComments)
  • And get infos[0].lineInfo.Filename = "cgo_objdir/src/test1.go"

The cgo_objdir/src/test1.go isn't valid, we should have gotten just src/test1.go instead.

Then parser.ParseFile returns parser.ParseFile the error open /workdir/cgo_objdir/src/test1.go: no such file or directory

The full example is here https://github.com/podtserkovskiy/issue-69689

The CL 618276 is definitely fixes those problems for us, so linters on CGo files work the same way as they work on normal Go files.

During my experiments I didn't find any particular example when CL 618276 break something.
So it seems like nothing depends on these lines.

Do you have any specific examples when this CL 618276 can break existing code?

I understand, the decision how to resolve relative paths is of course debatable, but the natural resolving relative to CWD seems to be consistent AFAICT across other go standard tools and less magical then the current one.

What do you think about that?

@timothy-king
Copy link
Contributor

During #69689 (comment) I didn't find any particular example when CL 618276 break something.
So it seems like nothing depends on these lines.

Do folks have thoughts on how to scale up this experiment to avoid #26671 repeating?

@findleyr
Copy link
Contributor

@timothy-king it's still not clear to me that this is the correct fix.

I understand, the decision how to resolve relative paths is of course debatable, but the natural resolving relative to CWD seems to be consistent AFAICT across other go standard tools and less magical then the current one.

I don't think this is clear. It seems to me that if the line directive says "dir/foo.go", I would expect that to be relative to the current file, because the file itself is not associated with any CWD.

I think the line directives are just wrong. Why not just run cgo with -trimpath set to something like "$(pwd)=>..", to get the correct path adjustments?

@podtserkovskiy
Copy link
Contributor Author

I think the line directives are just wrong. Why not just run cgo with -trimpath set to something like "$(pwd)=>..", to get the correct path adjustments?

@findleyr I did a quick experiment on github.com/podtserkovskiy/issue-69689
Adding -trimpath "$(pwd)=>.." helps to fix the original issue, but creates another.

$ go tool cgo -objdir cgo_objdir -trimpath "$(pwd)=>.." testdata/test1.go
$ go run main.go                                               
2024/11/19 17:48:03 Raw CGo file parsed
2024/11/19 17:48:03 //line resolved correctly

However the compiler produces errors with inconsistent paths when does typecheck if CGo files has type-errors.
Let's introduce a type error into [testdata/test1.go]

$ go tool cgo -objdir cgo_objdir -trimpath "$(pwd)=>.." testdata/test1.go
$ go tool compile -o /dev/null cgo_objdir/_cgo_gotypes.go cgo_objdir/test1.cgo1.go
cgo_objdir/_cgo_gotypes.go:7:8: could not import syscall (file not found)
cgo_objdir/_cgo_gotypes.go:9:20: could not import runtime/cgo (file not found)
../testdata/test1.go:6:9: cannot use number (variable of type _Ctype_int) as int value in return statement

i.e. the file-paths of files without //line comment are valid in the context of the directory where cmd/compile executed, but invalid for files with "//line" comments because of ".."

But with -trimpath "$(pwd) the output is correct

$ go tool cgo -objdir cgo_objdir -trimpath "$(pwd)" testdata/test1.go
$ go tool compile -o /dev/null cgo_objdir/_cgo_gotypes.go cgo_objdir/test1.cgo1.go
cgo_objdir/_cgo_gotypes.go:7:8: could not import syscall (file not found)
cgo_objdir/_cgo_gotypes.go:9:20: could not import runtime/cgo (file not found)
testdata/test1.go:6:9: cannot use number (variable of type _Ctype_int) as int value in return statement

The issue is that the compiler and go/scanner interpret //line comments differently. It would be nice if we could align their behaviour.

@podtserkovskiy
Copy link
Contributor Author

We need to clarify the way tools interpret relative names in //line to move forward.

Presumably @griesemer had a similar goal in CL 100235 as I can see from this comment

The filenames in line directives now remain untouched by the scanner;
there is no cleanup or conversion of relative into absolute paths
anymore, in sync with what the compiler's scanner/parser are doing.
Any kind of filename transformation has to be done by a client. This
makes the scanner code simpler and also more predictable.

But this part of CL 100235 has been reverted to the old behaviour by CL 127658.
Fortunately it looks like that's not the case anymore.

We have at least two options how to standardise this behaviour:

  1. Use relative filenames "as is" (what cmd/compile/internal/syntax does)
  2. Interpret relative filenames as relative to the current file (what go/scanner does)

Which one sounds better to you? Are there more options?

@adonovan
Copy link
Member

adonovan commented Nov 20, 2024

We have at least two options how to standardise this behaviour:

  1. Use relative filenames "as is" (what cmd/compile/internal/syntax does)
  2. Interpret relative filenames as relative to the current file (what go/scanner does)

The problem with option 1 is that it assumes the process's working directory is the same as what was used by -trimpath or the tool that generated the cgo file. That may be true for the compiler because of the strict environment in which it is run by go build. For example, the compiler's working directory is always GOROOT/src even when compiling, say, encoding/json. But this is not true for source-reading tools in general, which may be run from anywhere and have no way to know what directory to assume when they encounter a relative file name in a line directive.

The only directory that can be reliably used as as a base is the parent directory of the source file itself; that is, option 2.

In short, I think the problem is in GOROOT/src/cmd, not in go/scanner or x/tools.

@adonovan
Copy link
Member

I have filed a new proposal #70478 to change the compiler, and suggest that we close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

7 participants