Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Morty stack overflow #51

Open
flaviens opened this issue Oct 26, 2022 · 14 comments
Open

Morty stack overflow #51

flaviens opened this issue Oct 26, 2022 · 14 comments

Comments

@flaviens
Copy link

flaviens commented Oct 26, 2022

Hi there! Me again 😇

Morty stack overflows on some CPU code.

Output

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
[1]    30715 IOT instruction (core dumped)  /home/user/.cargo/bin/morty -f Bender.sources

How to reproduce

Tried on Ubuntu 22.04 with morty 0.8.0.

git clone https://github.com/flaviens/a2o.git
cd a2o
git checkout reprod_morty_stkovf
morty -f Bender.sources -DNCLK_WIDTH=1

Thanks!

@micprog
Copy link
Member

micprog commented Oct 26, 2022

I have encountered stack overflows with morty before, but have not had the chance to concretely debug it or look for the root cause. I believe it is due to sv-parser, and it occurs for very long systemverilog statements. Modifying the code to split these statements works, but is a cumbersome workaround. Feel free to dig into the morty and sv-parser code and find the root cause, I'd be happy to review a corresponding pull request!

@flaviens
Copy link
Author

flaviens commented Oct 26, 2022

Hi Michael, thank you for your quick response!

Having no familiarity with the implementation of Morty and sv-parser, it's unlikely that I can find the root cause if you cannot yourself unfortunately.
Therefore I'm fine with temporarily pre-processing some long systemverilog statements.
Do you have any additional information about how to proceed in this direction (which type of statements, which length, typical problems)?

Thanks!

@micprog
Copy link
Member

micprog commented Oct 26, 2022

I think the easiest way to find the culprit is to use morty's verbose mode to help figure out which file contains the issue. From my experience it is usually very long assign statements, but I haven't dug deep into the specifics.

@flaviens
Copy link
Author

flaviens commented Oct 26, 2022

Thank you for your response! I made several experiments:

  1. Morty on each file individually: this way, I do not get any stack overflow. This hampers the direct use of -verbose for debugging unfortunately, since the bug comes from combining the files.
  2. Morty on each file individually and then Morty on the output of each file. I get the stack overflow.
  3. sv2v on each file individually and then Morty on the output of each file. I get the stack overflow, despite the source files looking very different compared to experiments 1 and 2. Maybe the problematic sequence of lines (experiment 1 showed that it is not a single line) has been preserved.
    Experiment 3 could suggest that the bug does not originate from the parser, without strong evidence though.

@micprog
Copy link
Member

micprog commented Oct 26, 2022

I don't know how much I can help here. I tested your repository and, after removing mmq_spr.v due to errors (I believe begin with no end), I was able to run everything with the sequential implementation (on the branch that can sequentially propagate defines), albeit with the -i flag to bypass several failing files. To propagate defines, everything needs to be run sequentially, but the parallel implementation on the master branch seems to fail, I assume because simultaneous processing puts excessive load on the stack. If you run the sequential version with -v you should see the last file that has started parsing, so if there is an issue you should be able to detect it that way.

@flaviens
Copy link
Author

Hi again!
Sorry that it was not clear, you could get rid of the parse errors by rebasing to the fix_begin_genvar branch. I did it for you, you just need to force pull.
I tried with -v, and it still stack overflows. The problem is that it displays a different last file before bug for each run. I also noticed that the order in which morty treats files is always different.

@micprog
Copy link
Member

micprog commented Oct 26, 2022

Are you using the development version on the collect_defines branch? The master branch also shows the behavior you describe for me, AFAIK due to the parallel parsing of the files.

@zarubaf
Copy link
Collaborator

zarubaf commented Oct 26, 2022

@huettern also had the same problem with some large auto-generated Xilinx files. My assumption would be that this is due to an exceeded parse depth.

@flaviens
Copy link
Author

flaviens commented Oct 26, 2022

Thank you for your answers! Indeed with the collect_defines files it seems to work 🥳 ! However it's super slow, so not the best candidate for a master. How about a flag for parallelism?

@micprog
Copy link
Member

micprog commented Oct 26, 2022

Great that it works. Indeed it is far slower, but depending on the desired feature set it is the way to go. I doubt there is much desire to spend time speeding up parsing significantly for morty, AFAIK pickling is an infrequent task that can tolerate some latency. I plan to add some flags for configuration to enable/disable define chaining, parallelism, etc. (one of the reasons the branch is not yet ready for merging), but haven't found the time.

@flaviens
Copy link
Author

That would be great! Thanks!

@micprog
Copy link
Member

micprog commented Oct 27, 2022

Hi @flaviens,
Please check out #52 and let me know if it works for you.

@flaviens
Copy link
Author

Nice, the -q flag seems to do the job!
Why is --propagate_defines incompatible with --top, is it a conceptual mismatch or an implementation difficulty?

@micprog
Copy link
Member

micprog commented Oct 27, 2022

Glad to hear that.
--top removes files that are not needed for the specified top module, while --propagate_defines directly chains the parsed files due to included defines. Morty currently does not directly investigate all internal constructs, only modules, packages, and interfaces, so global typedefs (as you were using in #49) or other top-level elements may not be handled correctly. Arguably the incompatibility could result in a warning instead of ignoring the top module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants