ANTLR memory allocations look excessive? #3765

SimonSntPeter · 2022-06-27T16:20:07Z

SimonSntPeter
Jun 27, 2022

For a tiny little parsing input in C# I noticed it took a full second to start up. Using the profiler I looked at the memory allocated and found that it seems to be doing a gigantic amount of object generation, which then has to be cleaned up by the GC.

Here is my input, 413 bytes of it:

{
    LDB_run_internal_self_tests := false;
--    LDB_run_internal_self_tests := true;
};

create table not null orders
(
    onum int pkey,
    oname varchar(30),
    cnum int
);

select onum, count(oname) as nn
from orders
where onum = 23
group by onum
having count(oname) = 999
;

select onum, count(oname) as nn
from orders
where onum = 123
group by onum
having count(oname) = 888
;

Attached is an image of the profiler's output. The top graph I guess is the allocation nursery, and you can see what appeared to be five garbage collections in just that. I don't know what the bottom graph is but it's going to 48 meg.

Below that is the number of allocations from ANTLR. I have no idea how it could possibly be allocating this much.

A while back I did some profiling with the Python output, and tentatively concluded that the slowdown was something to do with the number of objects allocated, which makes sense because Python uses a skanky expensive reference counting GC and a fallback mark-sweep if necessary for cycles, whereas Java and C# seem to have much more efficient generational collectors.

I don't have time to dig into this, but I would like other people's opinion. Can anyone reproduce this?

Please note that the stuff I'm parsing actually produces an AST inside and a few other bits and pieces, but if you look at the breakdown in the attached image, you'll see that it really is utterly overwhelmed by ANTLR-related objects.

Keep in mind I may be doing something wrong, but given that, any thoughts?

jan

KvanTTT · 2022-06-27T18:27:14Z

KvanTTT
Jun 27, 2022

I am afraid this data can not be shrunk a lot, because ATN is a static representation of grammar. It's encoded into generated code, it's deserialized on loading, and it's used during lexing/parsing.

But it can be shrunk moderately if you use ATN optimizations, for instance this one: #3676

0 replies

SimonSntPeter · 2022-06-27T19:25:36Z

SimonSntPeter
Jun 27, 2022
Author

Hi, thanks for this.
It does leave me with a number of questions below:

Because ATN is a static representation of grammar

I thought ANTLR was recursive descent, and that the C# output was really the encoding of the grammar? (I should know more about ALL(*), but I don't.)

Putting that aside, that explains part of what I saw, which is that one part (ATN transition, it might have been) didn't really grow when there was more input to parse, but other stuff did. For example, the very topmost item, ATNConfig, drastically increased. I'll have to do some more tests tomorrow and post evidence, not tonight.

But it also fails to explain other things, for example the ATN.Transition enumerator being allocated in huge numbers, and nearly 20,000 sharpen.bitset allocations. That just makes no sense to me.

So, a question, is it possible to reuse a parser once constructed without throwing it away and recreating it from scratch? For example for me the code looks like this:

public static (LDBParser, ErrorListener<IToken>) makeParser(string inputStr) {
    var str = new AntlrInputStream(inputStr);
    var lexer = new LDBLexer(str);
    var tokens = new CommonTokenStream(lexer);
    var parser = new LDBParser(tokens);
    var errListener = new ErrorListener<IToken>(parser, lexer, tokens);
    parser.AddErrorListener(errListener);
    parser.AddErrorListener(new DiagnosticErrorListener(false));
    lexer.AddErrorListener(new ErrorListener<int>(parser, lexer, tokens));
    parser.Interpreter.PredictionMode = PredictionMode.LL_EXACT_AMBIG_DETECTION;

    return (parser, errListener);
}

and I do that every time. If it's so expensive then I'd rather recycle than re-create from scratch. A one-second startup, if you're doing huge numbers of these, will be crippling.

I'll look into this tomorrow and try and give further breakdown.

jan

1 reply

KvanTTT Jun 27, 2022

Putting that aside, that explains part of what I saw, which is the one part (ATN transition it might have been) didn't really grow when you had more input to parse, but other stuff did. For example, the very topmost item, ATN config drastically increased. I'll have to do some more tests tomorrow and post evidence, not tonight.

Yes, another part is dynamic, and more and more of it is allocated as the program runs. The responsible classes are located here: https://github.com/antlr/antlr4/tree/dev/runtime/Java/src/org/antlr/v4/runtime/dfa
You can reset the cache in the way described on Stack Overflow: https://stackoverflow.com/a/28843647/1046374.

So, a question, is it possible to reuse a parser once constructed without throwing it away and recreating it from scratch? For example for me the code looks like this:

Dynamic info should be cached anyway. I don't know why it's being recreated on every construction. Maybe because of LL_EXACT_AMBIG_DETECTION mode, I've never used it.

kaby76 · 2022-06-28T12:02:39Z

kaby76
Jun 28, 2022

ATNConfig() constructors are not called during deserialization. But I doubt deserialization is a performance problem at all. For plsql, the largest grammar we have, with many thousands of rules, deserialization takes ~0.05s for each ATN, lexer and parser (Ryzen 7 2700 at standard clock speed, 16GB DDR4, SSD), out of ~2s total (plsql/examples/aggregate01.sql). I would first eliminate grammar issues. Check the ambiguity with IntelliJ on the small input. For example, for plsql, there's a max k of 26!! associated with unary_expression, and lots of ambiguity in atom. Nothing good can happen with numbers like these. That's why it takes 2s to parse a 73-line example. (The IntelliJ IDE also had a fatal error, but that's another issue.)

Remember: ANTLR gives you a lot of rope to hang yourself with. Everything starts with a quality grammar.

0 replies

KvanTTT · 2022-06-30T17:50:56Z

KvanTTT
Jun 30, 2022

BTW, a similar issue already exists: #821

1 reply

parrt Jul 2, 2022
Maintainer

There is also the discussion of freeze-drying the ATN state so it can be reloaded, to avoid the startup cost. I don't have a link handy.

SimonSntPeter · 2022-06-30T18:06:30Z

SimonSntPeter
Jun 30, 2022
Author

Excuse delay in responding, just had my first covid.

Will skip gathering data and do as Ken Domino said and check lookahead. Also thanks to Ivan Kochurkin, will check out links. Will report back if I find anything noteworthy.

0 replies

SimonSntPeter · 2022-07-02T17:57:17Z

SimonSntPeter
Jul 2, 2022
Author

I recall this being mentioned. I think Ken Domino's point is likely the answer, that I've just written myself a very suboptimal grammar and should look at that first. jan


10 replies

KvanTTT Jul 5, 2022

BTW, some issues should be considered during consolidation. For instance, we've had a case where ? operator cannot be applied since it affects parsing behavior:

structDeclaration
    :   specifierQualifierList structDeclaratorList ';'
    |   specifierQualifierList ';'
    |   staticAssertDeclaration
    ;

kaby76 Jul 6, 2022

structDeclaration // The first two rules have priority order and cannot be simplified to one expression.
    :   specifierQualifierList structDeclaratorList ';'
    |   specifierQualifierList ';'
    |   staticAssertDeclaration
    ;

vs

structDeclaration
    :   specifierQualifierList structDeclaratorList? ';'
    |   staticAssertDeclaration
    ;

Input: struct A { int x;};

We need to be careful in refactoring with ambiguous parses. This string has two different derivations. Trash can do this refactoring, which is a grouping between the first and second alts. It's implemented using an N-way diff among all or some of the alts of a rule.

But, in this specific example, this is what happens when one does not follow the Spec, struct-declaration : specifier-qualifier-list struct-declarator-list ;. It was in the initial version of the grammar.

parrt Jul 6, 2022
Maintainer

also note that ambiguity is undecidable statically.

kaby76 Jul 6, 2022

also note that ambiguity is undecidable statically.

Of course. But, given a finite set of inputs, ambiguity is decidable for the grammar under that set of strings. I think we can assume that we have a large test suite that would exercise the ambiguity.
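A minimal sketch of what "decidable under a set of strings" can look like in practice: for one concrete token sequence, count how many distinct parse trees a grammar admits, and flag the input as ambiguous if the count exceeds one. Everything here is an assumption made for illustration. The toy grammar (`decl`, `specList`, `spec`, `declarator`) only mimics the shape of the struct-declaration rule, tokens are pre-classified into `int`, `ID` and `;`, and the counter assumes a grammar with no empty alternatives and no left recursion. It is not how ANTLR or Trash detect ambiguity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

using Alt = std::vector<std::string>;                     // one alternative: a sequence of symbols
using Grammar = std::map<std::string, std::vector<Alt>>;  // rule name -> its alternatives
using Tokens = std::vector<std::string>;                  // pre-classified token types

// Counts parse trees by exhaustive, memoized span splitting.
// Assumes: no empty alternatives, no left recursion, no unit-rule cycles.
class DerivationCounter {
public:
    DerivationCounter(const Grammar& g, const Tokens& t) : g_(g), t_(t) {}

    // Number of distinct parse trees in which `sym` derives tokens [i, j).
    long count(const std::string& sym, std::size_t i, std::size_t j) {
        assert(i <= j && j <= t_.size());
        auto rule = g_.find(sym);
        if (rule == g_.end())  // not a rule name: treat as a terminal matching one token
            return (j == i + 1 && t_[i] == sym) ? 1 : 0;
        auto key = std::make_tuple(sym, i, j);
        auto hit = memo_.find(key);
        if (hit != memo_.end()) return hit->second;
        long total = 0;
        for (const Alt& alt : rule->second) total += countSeq(alt, 0, i, j);
        memo_[key] = total;
        return total;
    }

private:
    // Ways for the symbols alt[k..] to derive tokens [i, j).
    long countSeq(const Alt& alt, std::size_t k, std::size_t i, std::size_t j) {
        if (k == alt.size()) return i == j ? 1 : 0;
        long total = 0;
        for (std::size_t m = i + 1; m <= j; ++m) {  // each symbol consumes at least one token
            long left = count(alt[k], i, m);
            if (left != 0) total += left * countSeq(alt, k + 1, m, j);
        }
        return total;
    }

    const Grammar& g_;
    const Tokens& t_;
    std::map<std::tuple<std::string, std::size_t, std::size_t>, long> memo_;
};

// Hypothetical toy grammar, loosely shaped like the structDeclaration rule:
//   decl       : specList declarator ';' | specList ';'
//   specList   : spec specList | spec
//   spec       : 'int' | ID        (ID standing in for a typedef name)
//   declarator : ID
long countDerivations(const Tokens& tokens) {
    static const Grammar toy = {
        {"decl", {Alt{"specList", "declarator", ";"}, Alt{"specList", ";"}}},
        {"specList", {Alt{"spec", "specList"}, Alt{"spec"}}},
        {"spec", {Alt{"int"}, Alt{"ID"}}},
        {"declarator", {Alt{"ID"}}},
    };
    DerivationCounter counter(toy, tokens);
    return counter.count("decl", 0, tokens.size());
}
```

For the token sequence `int ID ;` (the `int x;` inside the struct) this toy grammar yields two derivations, one where `x` is the declarator and one where it is a second specifier, while `int ;` yields exactly one. Running such a counter over a test suite only reports ambiguities that the suite actually exercises.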

KvanTTT Jul 9, 2022

I don't completely understand how the ? operator works. Intuitively, b? should be treated either as (b | ) or ( | b). Consider the following grammar:

test1
    : b? c
    ;

test31
    : (b | ) c
    ;

test32
    : ( | b) c
    ;

b: ID;
c: ID+;
ID: [a-z]+;
WS: [ \t\r\n]+ -> skip;

Rules test1 and test31 return the following tree:

test32 returns the following:

OK, suppose b? is treated as (b | ). In this case, the rule

structDeclaration
    :   specifierQualifierList structDeclaratorList? ';'
    |   staticAssertDeclaration
    ;

should be equivalent to

structDeclaration
    :   specifierQualifierList structDeclaratorList ';'
    |   specifierQualifierList ';'
    |   staticAssertDeclaration
    ;

and deoptimization would not be required. But it isn't equivalent. It looks like the ? operator works inconsistently.
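To make the difference between (b | ) and ( | b) concrete, here is a small toy that tries the alternatives of the optional part in order and keeps the first one that lets the rest of the rule (c: ID+) succeed. This is only a sketch under a strong assumption: it imitates "the lowest-numbered viable alternative wins", which is how ANTLR resolves a true ambiguity, but it is not ANTLR's adaptive prediction. The names `Parse`, `OptionalOrder` and `parseOptionalThenC` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Parse {
    bool ok = false;
    std::string b;               // token matched by b, empty if the optional part was skipped
    std::vector<std::string> c;  // tokens matched by c: ID+
};

// MatchFirst models (b | ) c, SkipFirst models ( | b) c.
enum class OptionalOrder { MatchFirst, SkipFirst };

// Input is a list of ID tokens; the rule must consume all of them.
Parse parseOptionalThenC(const std::vector<std::string>& ids, OptionalOrder order) {
    const bool matchFirst = (order == OptionalOrder::MatchFirst);
    const bool tryMatchB[2] = {matchFirst, !matchFirst};  // alternatives in priority order
    for (bool matchB : tryMatchB) {
        Parse p;
        std::size_t pos = 0;
        if (matchB) {
            if (ids.empty()) continue;  // b needs one ID
            p.b = ids[0];
            pos = 1;
        }
        if (pos < ids.size()) {  // c needs at least one remaining ID
            p.c.assign(ids.begin() + static_cast<std::ptrdiff_t>(pos), ids.end());
            assert(!p.c.empty());
            p.ok = true;
            return p;  // first alternative that works wins
        }
    }
    return Parse{};  // no alternative fits
}
```

With the two tokens `a b`, the MatchFirst order gives b = `a` and c = `b`, while SkipFirst gives an empty b and c = `a b`. With the single token `a`, both orders skip b, because c still needs one ID.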

NagyGa1 · 2023-03-22T23:45:49Z

NagyGa1
Mar 22, 2023

We see a similar pattern on the C++ runtime. Here is a recent profile:

It is ~8500 total samples, of which (613+530+870+615+356)=2984, roughly 35%, are memory allocation / deallocation / dynamic cast.

It is probably not easy to move the existing code to use the stack instead of the heap, or some fixed structure, but it would give a big boost.
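As a rough illustration of the "fixed structure" idea, the sketch below places small tree nodes into a fixed-size buffer that lives on the stack, so building a tiny tree performs no heap allocations at all. `StackArena` and `Node` are hypothetical names for this example and have nothing to do with the actual C++ runtime classes. The arena only accepts trivially destructible types, since it never runs destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Bump allocator over a fixed buffer that is part of the object itself.
template <std::size_t Capacity>
class StackArena {
public:
    // Constructs a T inside the buffer; returns nullptr when the buffer is full.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "the arena never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        const std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > Capacity) return nullptr;
        used_ = start + sizeof(T);
        return ::new (static_cast<void*>(buffer_ + start)) T{std::forward<Args>(args)...};
    }

    void reset() { used_ = 0; }  // reuse the buffer for the next tree
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[Capacity];
    std::size_t used_ = 0;
};

// Hypothetical tree node, kept as a plain aggregate.
struct Node {
    int kind;
    Node* left;
    Node* right;
};

int countLeaves(const Node* n) {
    if (n == nullptr) return 0;
    if (n->left == nullptr && n->right == nullptr) return 1;
    return countLeaves(n->left) + countLeaves(n->right);
}

// Builds a root with two leaves entirely inside a stack buffer.
int buildAndCountLeaves() {
    StackArena<256> arena;
    Node* a = arena.create<Node>(1, nullptr, nullptr);
    Node* b = arena.create<Node>(1, nullptr, nullptr);
    Node* root = arena.create<Node>(0, a, b);
    assert(a != nullptr && b != nullptr && root != nullptr);
    return countLeaves(root);
}
```

The trade-off is that the capacity has to be chosen up front and nodes cannot outlive the arena, which is part of why retrofitting this into an existing runtime is hard.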

The dynamic cast normally can be replaced with virtual methods (which still have one virtual table lookup in the background compared to non-virtuals).
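A minimal sketch of that replacement, using a made-up `Node` / `Leaf` / `Inner` hierarchy rather than any real runtime class: the first function asks "is this a Leaf?" through `dynamic_cast`, the second asks the same question through a virtual method. Both give the same answer here; the virtual call typically costs one vtable lookup, whereas `dynamic_cast` has to inspect run-time type information.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    virtual ~Node() = default;
    virtual bool isLeaf() const { return false; }  // default answer for non-leaf nodes
};

struct Leaf : Node {
    bool isLeaf() const override { return true; }
};

struct Inner : Node {};

using NodeList = std::vector<std::unique_ptr<Node>>;

// Type test via RTTI.
int countLeavesWithCast(const NodeList& nodes) {
    int n = 0;
    for (const auto& p : nodes)
        if (dynamic_cast<const Leaf*>(p.get()) != nullptr) ++n;
    return n;
}

// Same test via a virtual method.
int countLeavesWithVirtual(const NodeList& nodes) {
    int n = 0;
    for (const auto& p : nodes)
        if (p->isLeaf()) ++n;
    return n;
}

// Sample data: Leaf, Inner, Leaf.
NodeList makeSample() {
    NodeList nodes;
    nodes.push_back(std::make_unique<Leaf>());
    nodes.push_back(std::make_unique<Inner>());
    nodes.push_back(std::make_unique<Leaf>());
    return nodes;
}

// Sanity check that the two approaches agree on a given list.
int countLeavesChecked(const NodeList& nodes) {
    const int viaVirtual = countLeavesWithVirtual(nodes);
    assert(viaVirtual == countLeavesWithCast(nodes));
    return viaVirtual;
}
```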

These were small trees with 1-3 leaves.

4 replies

jimidle Mar 23, 2023

What grammar are you using?

NagyGa1 Mar 23, 2023

Sorry, I was wrong, here is the grammar, probably far from optimal. We probably tried to use the wrong tool for the job as well.

grammar NumberValue;

start: (
		date
		| time_value
		| date_time
		| float_
		| int_
		| currency
		| exp
		| percentage
	) EOF;

date_time: date time_value;

date: (
		(INT ('-' | '/') INT ('-' | '/') INT)
		| (INT ('-' | '/') INT)
		| (INT STRING INT)
		| (STRING INT)
		| (INT STRING)
		| (INT '-' STRING '-' INT)
	);

time_value: ( (INT ':' INT ':' (FLOAT | INT)) | (INT ':' INT));

float_: (FLOAT | LPAREN FLOAT RPAREN | '-' FLOAT);

int_: ('-'? INT | LPAREN INT RPAREN); //('-'? INT | LPAREN INT RPAREN);

currency: CURRENCY (INT | FLOAT) (COMMA INT)* (PERIOD INT+)?;

exp: ('-'? (INT | FLOAT) EXP ('-'? INT));

percentage: P_CENT* ('-'? INT |'-'? FLOAT) P_CENT+;


EXP: 'e' | 'E';
CURRENCY: '$' | '€'; //rem '¥'
P_CENT: '%';
COMMA: ',';
PERIOD: '.';
COLON: ':';
LPAREN: '(';
RPAREN: ')';
INT: [0-9]+;
STRING: [A-Za-z]+;
FLOAT: INT PERIOD INT+;
WS: [ \t\r\n]+ -> skip;

kaby76 Mar 23, 2023

Why is any of this written in parser rules? It should all be in the lexer.

NagyGa1 Mar 23, 2023

Yes, noted. The feedback I'm trying to give here is that the current C++ runtime could be a lot faster without new SomeClass() and the like.

jimidle · 2023-03-23T05:42:58Z

jimidle
Mar 23, 2023

Well, unless you’re parsing some large input, all measurements will be dominated by initialization.

On Thu, Mar 23, 2023 at 13:40 Gabor Nagy wrote: Something very simple, just an int and a float in it.

1 reply

NagyGa1 Mar 23, 2023

Yes, noticed that all results seem to have a 2000ns baseline.
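One way to see how much of a measurement is such a fixed baseline is to time the one-time setup separately from the repeated work. The harness below is a generic sketch using `std::chrono::steady_clock`; `Timing`, `measure` and `demoMeasurement` are hypothetical names, and the "setup" and "work" in the demo are trivial stand-ins, not ANTLR calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct Timing {
    double setupNs;          // one-time initialization cost
    double perCallNs;        // average cost of one call after setup
    std::uint64_t checksum;  // keeps the work observable so it is not optimized away
};

// Runs setup() once, then work(state, i) `iterations` times, timing both phases.
template <typename Setup, typename Work>
Timing measure(Setup setup, Work work, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;
    const auto t0 = Clock::now();
    auto state = setup();
    const auto t1 = Clock::now();
    std::uint64_t checksum = 0;
    for (int i = 0; i < iterations; ++i) checksum += work(state, i);
    const auto t2 = Clock::now();
    auto toNs = [](Clock::duration d) {
        return std::chrono::duration<double, std::nano>(d).count();
    };
    return Timing{toNs(t1 - t0), toNs(t2 - t1) / iterations, checksum};
}

// Stand-in workload: build a lookup table once, then read from it repeatedly.
Timing demoMeasurement() {
    auto setup = []() -> std::vector<int> {
        std::vector<int> table(1000);
        std::iota(table.begin(), table.end(), 0);  // 0, 1, ..., 999
        return table;
    };
    auto work = [](const std::vector<int>& table, int i) -> std::uint64_t {
        return static_cast<std::uint64_t>(table[static_cast<std::size_t>(i) % table.size()]);
    };
    return measure(setup, work, 10000);
}
```

In a real benchmark the setup lambda would construct the lexer and parser, and the work lambda would parse one input with the already constructed objects.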

jimidle · 2023-03-24T01:35:23Z

jimidle
Mar 24, 2023

Are you reusing the parser and lexer?


8 replies

NagyGa1 Mar 24, 2023

Thank you. I couldn't figure out how to switch the C++ runtime into SLL mode. Could you point me in the right direction?

jimidle Mar 24, 2023

I think this is what you need:

MyParser *myparser = new MyParser(tokens);  // tokens: your antlr4::TokenStream*
myparser->getInterpreter<atn::ParserATNSimulator>()
    ->setPredictionMode(atn::PredictionMode::SLL);

After that first new, you can just set a new token stream on it. This will mean you don't deserialize the ATN every time you parse
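The construct-once, swap-the-input pattern can be sketched without the ANTLR runtime. In the toy below, `ToyParser` is a hypothetical stand-in whose constructor does the (here trivial) one-time setup and whose `setInput` only rebinds the text and resets the position. With the real C++ runtime the analogous step would presumably be a call such as `setTokenStream` on the existing parser, but check the runtime headers for your version before relying on that.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy "parser" that sums the non-negative integers found in its input.
class ToyParser {
public:
    ToyParser() : isDigit_(256, false) {
        // Stand-in for expensive one-time setup (for example loading tables).
        for (char c = '0'; c <= '9'; ++c) isDigit_[static_cast<unsigned char>(c)] = true;
        ++constructions();
    }

    // Rebind to new input without redoing the setup.
    void setInput(std::string text) {
        input_ = std::move(text);
        pos_ = 0;
    }

    long parseSum() {
        long sum = 0;
        while (pos_ < input_.size()) {
            if (!isDigit_[static_cast<unsigned char>(input_[pos_])]) {
                ++pos_;  // skip separators
                continue;
            }
            long value = 0;
            while (pos_ < input_.size() &&
                   isDigit_[static_cast<unsigned char>(input_[pos_])]) {
                value = value * 10 + (input_[pos_] - '0');
                ++pos_;
            }
            sum += value;
        }
        return sum;
    }

    // Counts how many times the setup has run, for demonstration only.
    static int& constructions() {
        static int n = 0;
        return n;
    }

private:
    std::vector<bool> isDigit_;
    std::string input_;
    std::size_t pos_ = 0;
};

// Parses every input with a single parser instance.
long sumAll(const std::vector<std::string>& inputs) {
    ToyParser parser;  // setup happens once here
    long total = 0;
    for (const auto& text : inputs) {
        parser.setInput(text);
        total += parser.parseSum();
    }
    assert(total >= 0);
    return total;
}
```

Calling `sumAll` with any number of inputs increases `ToyParser::constructions()` by exactly one, which is the property you want from the real parser when per-parse overhead matters.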

jimidle · 2023-03-24T08:48:52Z

jimidle
Mar 24, 2023

You’re welcome

On Fri, Mar 24, 2023 at 11:18 Gabor Nagy wrote: The overhead came down from ~2000ns to ~1500ns, which is important for us even if we use Antlr4 only on larger grammars / more complexity. SLL did not produce an eyeball-visible difference in this case. Thank you, was very helpful.

0 replies

dhaumann · 2024-01-26T15:48:04Z

dhaumann
Jan 26, 2024

For the C++ runtime, I just opened issue #4518; maybe that's interesting to some readers here.

0 replies
