
Size of parser stack items makes parsing slow #10

Open
ExpHP opened this issue Feb 14, 2021 · 4 comments
Labels: performance (Something is slow af)

Comments

ExpHP (Owner) commented Feb 14, 2021

One item that appears on benchmarks (currently measuring about 8% on anm-benchmark for th15/title.anm) is memcpy calls in lalrparser::__parse__Anything::__reduce:

[image: profiler output]

After some digging in binja and CE, it appears that the majority of memcpy samples counted were likely in these calls to __symbols.pop():

[image: disassembly view]

These (Location, __Symbol, Location) tuples are currently 256 bytes each. __Symbol is a union over all of the grammar's symbol types:

pub enum __Symbol<'input>
     {
        Variant0(Token<'input>),
        Variant1(&'input str),
        Variant2(::std::option::Option<Token<'input>>),
        Variant3(Sp<i32>),
        Variant4(::std::option::Option<Sp<i32>>),
        Variant5(()),
        ...
        Variant13(Sp<crate::parse::AnythingValue>),  // <-- largest variant
        ...
        Variant85(ast::Var),
        Variant86(ast::VarDeclKeyword),
    }

It's 256 bytes because Stmt is currently 208 bytes; then there's the 8-byte discriminant of AnythingValue, the 16 bytes added by Sp<...>, another 8 bytes for __Symbol's discriminant, and 8 bytes for each of the two Locations.

We could probably speed things up a little by making Stmt smaller. I'm not sure how much we can effectively do in the short term aside from boxing some things, which has its own cost; in the long term, arenas may help. Also, it would be fantabulous if we could figure out how to fix Spans to be only 8 bytes instead of an awkwardly-padded 12.

More importantly, in the meantime we want to be careful about letting Stmt grow any bigger.
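
(One cheap way to enforce that is a size-regression test, sketched below. This assumes it lives inside the crate; the module path and the 208-byte threshold are just taken from the numbers above.)

#[test]
fn stmt_stays_small() {
    // Hypothetical guard: fail CI if Stmt grows past the 208 bytes quoted
    // above. Bump the threshold consciously if it ever needs to grow.
    let size = std::mem::size_of::<crate::ast::Stmt>();
    assert!(size <= 208, "ast::Stmt grew to {} bytes", size);
}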

ExpHP added the performance label on Feb 14, 2021
ExpHP (Owner, Author) commented Nov 14, 2021

Current status:

  • Sp<Stmt> is currently 320 bytes
  • If I use black magic and sorcery to make Span and Option<Span> both 8 bytes, the size reduces to 256.

This 20% size reduction equated to a 10% runtime reduction when running on all 7 EoSD ECL files. (Note: WSL 2, files stored on NTFS.)

This "black magic and sorcery" actually makes it quite difficult to reason about things. I might consider sticking some boxes into Stmt instead...

ExpHP (Owner, Author) commented Nov 14, 2021

Boxing Stmt did not help.

I made all Stmt-producing productions return Box<ast::Stmt> or Box<ast::StmtKind> instead, and dereferenced each one as it was inserted into an AST node. I copied the symbol enum from generated.rs and verified that its size was now reduced to 192 bytes.

Unfortunately the runtime performance did not change from what it is currently.

Doing this required me to write this travesty in place of what used to be <stmts:(Sp<Stmt>)*>:

#[inline]
SpStmts0: Vec<Sp<ast::Stmt>> = {
    => vec![],
    SpStmts1,
};

SpStmts1: Vec<Sp<ast::Stmt>> = {
    <stmt:Sp<BoxStmt>> => vec![sp!(stmt.span => *stmt.value)],
    <stmts:SpStmts1> <stmt:Sp<BoxStmt>> => util::push(stmts, sp!(stmt.span => *stmt.value)),
};

An aside: it looks like inline (#[inline]) rules in the grammar still generate symbol variants that get placed on the stack.

I discovered this when I tried to clean up the above code using the following so that I could write <stmts:(Sp<Unbox<BoxStmt>>)*>:

// NOTE:  pub type UnboxType<T> = <T as core::ops::Deref>::Target;
#[inline]
Unbox<Rule>: util::UnboxType<Rule> = <r:Rule> => *r;

Unfortunately despite the #[inline] this still produces a variant:

Variant102(util::UnboxType<Box<ast::Stmt>>),

ExpHP (Owner, Author) commented Nov 14, 2021

Profiling again with the boxed Stmts:

[image: profiler output]

Doesn't look like a significant contribution from allocation; it's still all memcpy...

ExpHP (Owner, Author) commented Nov 14, 2021

I tried replacing the boxes with a qcell-based Pool abstraction: a tiny fixed-size pool of two reusable Option<ast::Stmt> and Option<ast::StmtKind> slots. The statements in the symbol enum effectively became &'a LCell<Option<T>>: pointer-sized values that could be created from (and returned into) a T without any allocation or synchronization.
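
(For concreteness, roughly the shape of that pool, assuming the qcell crate's LCell/LCellOwner API; the names are made up and a String stands in for the Stmt payload:)

use qcell::{LCell, LCellOwner};

fn main() {
    LCellOwner::scope(|mut owner| {
        // One reusable slot; the real pool had two (Stmt and StmtKind).
        let slot = owner.cell(None::<String>);

        // "Allocate": move the value into the slot and keep only the
        // pointer-sized &LCell handle.
        *owner.rw(&slot) = Some(String::from("stmt payload"));
        let handle = &slot;

        // "Free": take the value back out; the slot is reusable again,
        // with no heap allocation or synchronization involved.
        let value = owner.rw(handle).take().expect("slot was filled");
        assert_eq!(value, "stmt payload");
    });
}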

This still had no effect.

This seems paradoxical. It leads me to believe that perhaps it is not the size of the Symbol enum that is the problem, but rather the size of the ast structs in general (including Stmt). But if that were the case, then the loop desugaring/decompilation passes should show up much more prominently here...

I don't know.

Relevant branches:
