Relation CSE (MaterializeInc#7715)
* rebase

* More tests updated for viewing

* reorganize logic into Bindings struct

* typos

* additional tests for Common Subexpression Elimination (CSE)

Co-authored-by: Philip Stoev <[email protected]>
frankmcsherry and philip-stoev authored Aug 13, 2021
1 parent 0a0483d commit c793609
Showing 13 changed files with 2,068 additions and 472 deletions.
1 change: 1 addition & 0 deletions src/transform/src/cse/mod.rs
@@ -10,3 +10,4 @@
//! Common subexpression elimination.
pub mod map;
pub mod relation_cse;
138 changes: 138 additions & 0 deletions src/transform/src/cse/relation_cse.rs
@@ -0,0 +1,138 @@
// Copyright Materialize, Inc. and contributors. All rights reserved.
//
// Use of this software is governed by the Business Source License
// included in the LICENSE file.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0.

//! Identifies common relation subexpressions and places them behind `Let` bindings.
//!
//! All structurally equivalent expressions, defined recursively as having structurally
//! equivalent inputs, and identical parameters, will be placed behind `Let` bindings.
//! The resulting expressions likely have an excess of `Let` expressions, and should be
//! subjected to the `InlineLet` transformation to remove those that are not necessary.
use std::collections::HashMap;

use expr::{Id, LocalId, MirRelationExpr};

use crate::TransformArgs;

/// Identifies common relation subexpressions and places them behind `Let` bindings.
#[derive(Debug)]
pub struct RelationCSE;

impl crate::Transform for RelationCSE {
    fn transform(
        &self,
        relation: &mut MirRelationExpr,
        _: TransformArgs,
    ) -> Result<(), crate::TransformError> {
        let mut bindings = Bindings::default();
        bindings.intern_expression(relation);
        bindings.populate_expression(relation);
        Ok(())
    }
}

/// Maintains `Let` bindings in a compact, explicit representation.
///
/// The `bindings` map contains neither `Let` bindings nor two structurally
/// equivalent expressions.
///
/// The bindings can be interpreted as an ordered sequence of let bindings,
/// ordered by their identifier, that should be applied in order before the
/// use of the expression from which they have been extracted.
#[derive(Debug, Default)]
pub struct Bindings {
    /// A list of let-bound expressions and their order / identifier.
    bindings: HashMap<MirRelationExpr, u64>,
    /// Mapping from conventional local `Get` identifiers to new ones.
    rebindings: HashMap<LocalId, LocalId>,
}

impl Bindings {
    /// Replace `relation` with an equivalent `Get` expression referencing a location in `bindings`.
    ///
    /// The algorithm performs a post-order traversal of the expression tree, binding each distinct
    /// expression to a new local identifier. It maintains the invariant that `bindings` contains no
    /// `Let` expressions, nor any two structurally equivalent expressions.
    ///
    /// Once each sub-expression is replaced by a canonical `Get` expression, each expression is also
    /// in a canonical representation, which is used to check for prior instances and drives re-use.
    fn intern_expression(&mut self, relation: &mut MirRelationExpr) {
        match relation {
            MirRelationExpr::Let { id, value, body } => {
                self.intern_expression(value);
                let new_id = if let MirRelationExpr::Get {
                    id: Id::Local(x), ..
                } = **value
                {
                    x
                } else {
                    panic!("Invariant violated")
                };
                self.rebindings.insert(*id, new_id);
                self.intern_expression(body);
                let body = body.take_dangerous();
                self.rebindings.remove(id);
                *relation = body;
            }
            MirRelationExpr::Get { id, .. } => {
                if let Id::Local(id) = id {
                    *id = self.rebindings[id];
                }
            }

            _ => {
                // All other expressions just need to apply the logic recursively.
                relation.visit1_mut(&mut |expr| {
                    self.intern_expression(expr);
                })
            }
        };

        // This should be fast, as it depends directly on only `Get` expressions.
        let typ = relation.typ();
        // We want to maintain the invariant that `relation` ends up as a local `Get`.
        if let MirRelationExpr::Get {
            id: Id::Local(_), ..
        } = relation
        {
            // Do nothing, as the expression is already a local `Get` expression.
        } else {
            // Either find an instance of `relation` or insert this one.
            let bindings_len = self.bindings.len() as u64;
            let id = self
                .bindings
                .entry(relation.take_dangerous())
                .or_insert(bindings_len);
            *relation = MirRelationExpr::Get {
                id: Id::Local(LocalId::new(*id)),
                typ,
            }
        }
    }

    /// Populates `expression` with necessary `Let` bindings.
    ///
    /// This population may result in substantially more `Let` bindings than one
    /// might expect. It is very appropriate to run the `InlineLet` transformation
    /// afterwards to remove `Let` bindings that it deems unhelpful.
    fn populate_expression(self, expression: &mut MirRelationExpr) {
        // Convert the bindings into a sequence, ordered by the local identifier.
        let mut bindings = self.bindings.into_iter().collect::<Vec<_>>();
        bindings.sort_by_key(|(_, i)| *i);

        for (value, index) in bindings.into_iter().rev() {
            let new_expression = MirRelationExpr::Let {
                id: LocalId::new(index),
                value: Box::new(value),
                body: Box::new(expression.take_dangerous()),
            };
            *expression = new_expression;
        }
    }
}
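
The interning and population steps in `relation_cse.rs` can be illustrated on a much smaller expression type. The following is a minimal sketch, not the code from this commit: `Expr`, `ToyBindings`, and `toy_cse` are hypothetical names, the toy type has only a source, a filter, and a union, and the sketch assumes the input contains no `Let` nodes (so it needs no `rebindings` map).

```rust
use std::collections::HashMap;

/// Toy relation expression; a hypothetical stand-in for `MirRelationExpr`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Expr {
    /// A named source collection.
    Source(String),
    /// A reference to a let-bound expression.
    Local(u64),
    Filter(Box<Expr>),
    Union(Box<Expr>, Box<Expr>),
    Let { id: u64, value: Box<Expr>, body: Box<Expr> },
}

/// Maps each distinct expression to the identifier it is bound to.
#[derive(Default)]
pub struct ToyBindings {
    bindings: HashMap<Expr, u64>,
}

impl ToyBindings {
    /// Post-order traversal: intern the children first, then this node.
    pub fn intern(&mut self, expr: &mut Expr) {
        match expr {
            Expr::Filter(input) => self.intern(input),
            Expr::Union(left, right) => {
                self.intern(left);
                self.intern(right);
            }
            Expr::Source(_) | Expr::Local(_) => {}
            // Assumption of this sketch: the input is `Let`-free.
            Expr::Let { .. } => panic!("sketch assumes Let-free input"),
        }
        if !matches!(*expr, Expr::Local(_)) {
            // Children are already `Local`, so structurally equal
            // expressions hash to the same map entry and share one id.
            let next = self.bindings.len() as u64;
            let taken = std::mem::replace(expr, Expr::Local(0));
            let id = *self.bindings.entry(taken).or_insert(next);
            *expr = Expr::Local(id);
        }
    }

    /// Wraps `expr` in one `Let` per binding, smallest identifier outermost.
    pub fn populate(self, expr: &mut Expr) {
        let mut bindings: Vec<(Expr, u64)> = self.bindings.into_iter().collect();
        bindings.sort_by_key(|(_, id)| *id);
        for (value, id) in bindings.into_iter().rev() {
            let body = std::mem::replace(expr, Expr::Local(0));
            *expr = Expr::Let { id, value: Box::new(value), body: Box::new(body) };
        }
    }
}

/// Hypothetical helper that runs both phases on an owned expression.
pub fn toy_cse(mut expr: Expr) -> Expr {
    let mut bindings = ToyBindings::default();
    bindings.intern(&mut expr);
    bindings.populate(&mut expr);
    expr
}
```

Applied to a union of two identical filters over the same source, this sketch produces three nested bindings (the source, the filter, and the union) around a final reference to the union, which has the same shape as the `Applied RelationCSE` output in the `steps` test data.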
10 changes: 10 additions & 0 deletions src/transform/src/lib.rs
@@ -270,6 +270,11 @@ impl Optimizer {
// `Map {Cross Join (Input, Constant()), Literal}`.
// Join fusion will clean this up to `Map{Input, Literal}`
Box::new(crate::map_lifting::LiteralLifting),
// Identifies common relation subexpressions.
// Must be followed by let inlining, to keep under control.
Box::new(crate::cse::relation_cse::RelationCSE),
Box::new(crate::inline_let::InlineLet),
Box::new(crate::update_let::UpdateLet),
Box::new(crate::FuseAndCollapse::default()),
],
}),
@@ -306,6 +311,11 @@ impl Optimizer {
Box::new(crate::projection_lifting::ProjectionLifting),
Box::new(crate::join_implementation::JoinImplementation),
Box::new(crate::fusion::project::Project),
// Identifies common relation subexpressions.
// Must be followed by let inlining, to keep under control.
Box::new(crate::cse::relation_cse::RelationCSE),
Box::new(crate::inline_let::InlineLet),
Box::new(crate::update_let::UpdateLet),
Box::new(crate::reduction::FoldConstants { limit: Some(10000) }),
];
let mut optimizer = Self::for_view();
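
The comments in these hunks say that `RelationCSE` must be followed by let inlining. A rough sketch of why, on a toy plan type: the rule below inlines every binding that is referenced at most once, which is an assumption made for illustration and not necessarily the exact policy of the real `InlineLet` transform. `Plan`, `count_uses`, `inline`, and `toy_inline_let` are hypothetical names, and the sketch assumes every `Let` identifier is unique, as is the case right after `RelationCSE`.

```rust
use std::collections::HashMap;

/// Toy plan type; a hypothetical stand-in for `MirRelationExpr`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Plan {
    Source(String),
    Local(u64),
    Filter(Box<Plan>),
    Union(Box<Plan>, Box<Plan>),
    Let { id: u64, value: Box<Plan>, body: Box<Plan> },
}

/// Counts how often each local identifier is referenced.
fn count_uses(plan: &Plan, uses: &mut HashMap<u64, usize>) {
    match plan {
        Plan::Source(_) => {}
        Plan::Local(id) => *uses.entry(*id).or_insert(0) += 1,
        Plan::Filter(input) => count_uses(input, uses),
        Plan::Union(left, right) => {
            count_uses(left, uses);
            count_uses(right, uses);
        }
        Plan::Let { value, body, .. } => {
            count_uses(value, uses);
            count_uses(body, uses);
        }
    }
}

/// Substitutes bindings used at most once and drops their `Let` nodes.
/// `env` holds the values of bindings that are waiting to be substituted.
fn inline(plan: Plan, uses: &HashMap<u64, usize>, env: &mut HashMap<u64, Plan>) -> Plan {
    match plan {
        Plan::Source(name) => Plan::Source(name),
        // A pending value has a single use, so it can be moved out.
        Plan::Local(id) => env.remove(&id).unwrap_or(Plan::Local(id)),
        Plan::Filter(input) => Plan::Filter(Box::new(inline(*input, uses, env))),
        Plan::Union(left, right) => {
            let left = inline(*left, uses, env);
            let right = inline(*right, uses, env);
            Plan::Union(Box::new(left), Box::new(right))
        }
        Plan::Let { id, value, body } => {
            let value = inline(*value, uses, env);
            if uses.get(&id).copied().unwrap_or(0) <= 1 {
                // Single-use (or unused) binding: remove the `Let`.
                env.insert(id, value);
                inline(*body, uses, env)
            } else {
                // Shared binding: keep the `Let` so the work is reused.
                let body = inline(*body, uses, env);
                Plan::Let { id, value: Box::new(value), body: Box::new(body) }
            }
        }
    }
}

/// Hypothetical entry point: count references, then inline.
pub fn toy_inline_let(plan: Plan) -> Plan {
    let mut uses = HashMap::new();
    count_uses(&plan, &mut uses);
    inline(plan, &uses, &mut HashMap::new())
}
```

On a three-binding chain in which only the middle binding is referenced twice, this sketch keeps just that one `Let`, which is the shape of the `Applied InlineLet` step in the `steps` test data.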
11 changes: 5 additions & 6 deletions src/transform/tests/testdata/join-implementation
@@ -66,18 +66,17 @@ opt
(join [(get x) (get x)] [[5 #0 5 #3]])
----
----
%0 =
%0 = Let l0 =
| Get x (u0)
| Filter (#0 = 5)
| ArrangeBy ()

%1 =
| Get x (u0)
| Filter (#0 = 5)
| Get %0 (l0)
| ArrangeBy ()

%2 =
| Join %0 %1
| | implementation = Differential %1 %0.()
| Join %1 %0
| | implementation = Differential %0 %1.()
| | demand = (#0..#5)
----
----
38 changes: 37 additions & 1 deletion src/transform/tests/testdata/keys
@@ -75,7 +75,43 @@ Applied Fixpoint { transforms: [FuseAndCollapse { transforms: [ProjectionExtract
| | keys = ((#0), (#1))

====
No change: Fixpoint { transforms: [PredicatePushdown, NonNullable, ColumnKnowledge, Demand, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ReductionPushdown, ReduceElision, LiteralLifting, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ProjectionLifting, JoinImplementation, ColumnKnowledge, FoldConstants { limit: Some(10000) }, Filter, Demand, LiteralLifting, Map], limit: 100 }, ReductionPushdown, Map, ProjectionLifting, JoinImplementation, Project, FoldConstants { limit: Some(10000) }
No change: Fixpoint { transforms: [PredicatePushdown, NonNullable, ColumnKnowledge, Demand, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ReductionPushdown, ReduceElision, LiteralLifting, RelationCSE, InlineLet, UpdateLet, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ProjectionLifting, JoinImplementation, ColumnKnowledge, FoldConstants { limit: Some(10000) }, Filter, Demand, LiteralLifting, Map], limit: 100 }, ReductionPushdown, Map, ProjectionLifting, JoinImplementation, Project
====
Applied RelationCSE:
%0 = Let l0 =
| Get x (u0)
| | types = (Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))

%1 = Let l1 =
| Get %0 (l0)
| | types = (Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))
| Project (#0..#2, #0..#2)
| | types = (Int32?, Int64?, Int32?, Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))

%2 =
| Get %1 (l1)
| | types = (Int32?, Int64?, Int32?, Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))
| | types = (Int32?, Int64?, Int32?, Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))
| | types = (Int32?, Int64?, Int32?, Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))

====
Applied InlineLet:
%0 =
| Get x (u0)
| | types = (Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))
| Project (#0..#2, #0..#2)
| | types = (Int32?, Int64?, Int32?, Int32?, Int64?, Int32?)
| | keys = ((#0), (#1))

====
No change: UpdateLet, FoldConstants { limit: Some(10000) }
====
Final:
%0 =
14 changes: 8 additions & 6 deletions src/transform/tests/testdata/lifting
@@ -352,17 +352,19 @@ opt
[#2 #1])])
----
----
%0 =
%0 = Let l0 =
| Get y (u1)
| Map 1
| Project (#2, #0)

%1 =
| Get y (u1)
| Map 1
| Project (#2, #1)
| Get %0 (l0)
| Project (#2, #0)

%2 =
| Union %0 %1
| Get %0 (l0)
| Project (#2, #1)

%3 =
| Union %1 %2
----
----
52 changes: 47 additions & 5 deletions src/transform/tests/testdata/steps
@@ -30,19 +30,61 @@ steps
| Union %0 %1

====
No change: TopKElision, NonNullRequirements, Fixpoint { transforms: [FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [PredicatePushdown, NonNullable, ColumnKnowledge, Demand, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ReductionPushdown, ReduceElision, LiteralLifting, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [ProjectionLifting, JoinImplementation, ColumnKnowledge, FoldConstants { limit: Some(10000) }, Filter, Demand, LiteralLifting, Map], limit: 100 }, ReductionPushdown, Map, ProjectionLifting, JoinImplementation, Project, FoldConstants { limit: Some(10000) }
No change: TopKElision, NonNullRequirements, Fixpoint { transforms: [FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }, Fixpoint { transforms: [PredicatePushdown, NonNullable, ColumnKnowledge, Demand, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }
====
Final:
%0 =
Applied Fixpoint { transforms: [ReductionPushdown, ReduceElision, LiteralLifting, RelationCSE, InlineLet, UpdateLet, FuseAndCollapse { transforms: [ProjectionExtraction, ProjectionLifting, Map, Filter, Project, Join, InlineLet, Reduce, Union, UnionBranchCancellation, UpdateLet, RedundantJoin, FoldConstants { limit: Some(10000) }] }], limit: 100 }:
%0 = Let l0 =
| Get x (u0)
| Filter #0

%1 =
| Union %0 %0

====
No change: Fixpoint { transforms: [ProjectionLifting, JoinImplementation, ColumnKnowledge, FoldConstants { limit: Some(10000) }, Filter, Demand, LiteralLifting, Map], limit: 100 }, ReductionPushdown, Map, ProjectionLifting, JoinImplementation, Project
====
Applied RelationCSE:
%0 = Let l0 =
| Get x (u0)

%1 = Let l1 =
| Get %0 (l0)
| Filter #0

%2 =
| Union %0 %1
%2 = Let l2 =
| Union %1 %1

%3 =
| Get %2 (l2)

====
Applied InlineLet:
%0 = Let l1 =
| Get x (u0)
| Filter #0

%1 =
| Union %0 %0

====
Applied UpdateLet:
%0 = Let l0 =
| Get x (u0)
| Filter #0

%1 =
| Union %0 %0

====
No change: FoldConstants { limit: Some(10000) }
====
Final:
%0 = Let l0 =
| Get x (u0)
| Filter #0

%1 =
| Union %0 %0

====
----