compiler: Revamp lowering of IndexDerivatives #2208

Merged on Oct 16, 2023 (79 commits)
38f50fa
compiler: Emulate e.find(type) with (faster) search(e, type)
FabioLuporini Apr 28, 2023
3c2ceee
compiler: Add IterationSpace.prefix
FabioLuporini May 3, 2023
ad3287a
compiler: Use internal repr for IndexSums
FabioLuporini May 3, 2023
f5be826
compiler: Add Bunch.__repr__
FabioLuporini May 3, 2023
1e88a01
compiler: Add ClusterGroup.rebuild
FabioLuporini May 4, 2023
95f51b6
compiler: Add ClusterGroup.properties
FabioLuporini May 5, 2023
020ad7d
compiler: Add minmax_index()
FabioLuporini May 12, 2023
ebcee2b
compiler: Add DAG.all_predecessors
FabioLuporini Jun 9, 2023
866a66c
tools: Add DAG.find_paths
FabioLuporini Jun 14, 2023
1db334e
compiler: Generalize AffineIndexAccessFunction
FabioLuporini Jun 15, 2023
023da18
compiler: Fix DefFunction printing
FabioLuporini Jun 20, 2023
34e30f5
compiler: aliases.Candidate -> ir.ExprGeometry
FabioLuporini Jun 22, 2023
26dbac2
compiler: Generalize and enhance ExprGeometry
FabioLuporini Jun 26, 2023
62b341c
compiler: Tweak minmax_index
FabioLuporini Jun 27, 2023
ed576e5
compiler: Add and_smart for guards auto-simplification
FabioLuporini Jul 4, 2023
8e89323
compiler: Improve Cluster.is_dense
FabioLuporini Jul 18, 2023
8e95763
compiler: Add IndexDerivative.base
FabioLuporini Jul 18, 2023
f1e25ed
compiler: Patch AffineIndexAccessFunction
FabioLuporini Jul 19, 2023
5188194
compiler: Support 2-pass impls w unexpansion
FabioLuporini Jul 18, 2023
8c38c88
compiler: Improve profiling of multipass implementations
FabioLuporini Jul 20, 2023
60dee8c
compiler: Patch sync_sections
FabioLuporini Jul 21, 2023
4a22575
compiler: Enhance CireIndexDerivatives
FabioLuporini Jul 21, 2023
cfc926c
compiler: Patch has_data_reuse to account for StencilDimension
FabioLuporini Jul 24, 2023
94cdb40
pep8 happiness
FabioLuporini Jul 24, 2023
c7f6605
compiler: Patch infer_dtype to support vector types
FabioLuporini Jul 27, 2023
45d9ae7
compiler: Fix DDA involving ComponentAccesses
FabioLuporini Jul 28, 2023
952dcf3
api: Support pattern-matching par-tile
FabioLuporini Jul 31, 2023
b453a16
compiler: Patch minimize_symbols for parlang backends
FabioLuporini Jul 31, 2023
01ff19f
compiler: Expand along SteppingDimensions
FabioLuporini Aug 2, 2023
b8c3c4b
compiler: Update behavior of ClusterGroup.syncs
FabioLuporini Aug 2, 2023
ce4d407
compiler: Add and exploit properties.is_parallel_atomic
FabioLuporini Aug 2, 2023
066236b
compiler: Enhance DDA across IndexDerivatives
FabioLuporini Aug 2, 2023
bc64af5
compiler: Tidy up utilities
FabioLuporini Aug 4, 2023
5139358
compiler: Make IndexDerivatives homogeneous irrespective of matvec
FabioLuporini Aug 22, 2023
9936002
compiler: Fix 2-pass implementations with expand=False
FabioLuporini Aug 23, 2023
dc83961
compiler: Tweak aliases selection
FabioLuporini Aug 25, 2023
58a9bc3
compiler: Patch CireIndexDerivatives
FabioLuporini Aug 25, 2023
c684a44
compiler: Add DAG.roots
FabioLuporini Aug 29, 2023
ee0f3ed
compiler: Rename CIRE search/compose funcs
FabioLuporini Aug 29, 2023
3f9a7be
compiler: Tweak cire-schedule behavior
FabioLuporini Aug 30, 2023
1f3b470
compiler: Improve DAG
FabioLuporini Aug 30, 2023
4399017
compiler: Make collect_derivatives stable
FabioLuporini Aug 30, 2023
7b7dad8
compiler: Tweak group aliases detection
FabioLuporini Aug 31, 2023
14167aa
compiler: Make Bunch iterable
FabioLuporini Sep 4, 2023
42d6f46
compiler: Drop .find where possible
FabioLuporini Apr 20, 2023
ed0a6c5
examples: Update expected output
FabioLuporini Sep 6, 2023
132ac87
compiler: Fix deterministic codegen
FabioLuporini Sep 6, 2023
edd3c91
compiler: Patch Expression.__repr__
FabioLuporini Sep 7, 2023
bcf6015
compiler: Restore previous cost model for aliases retention
FabioLuporini Sep 7, 2023
a465e29
compiler: Polishing
FabioLuporini Sep 7, 2023
8704282
compiler: Fix AbstractSymbol hashing
FabioLuporini Sep 7, 2023
a8f9e49
compiler: Avoid redundancies within relations
FabioLuporini Sep 7, 2023
16af228
compiler: Fix AbstractSymbol.__eq__
FabioLuporini Sep 8, 2023
35053d9
compiler: Relax Dimension._hashable_content
FabioLuporini Sep 9, 2023
dfbe169
compiler: Patch MPINeighborhood reconstruction
FabioLuporini Sep 11, 2023
f142b32
compiler: Add TBArray
FabioLuporini Sep 12, 2023
957000b
compiler: Speedup codegen
FabioLuporini Sep 8, 2023
1c5ad3d
compiler: Speedup codegen by minimizing relations
FabioLuporini Sep 8, 2023
019fd4f
compiler: Speedup codegen by minimizing relations
FabioLuporini Sep 11, 2023
7c47a28
compiler: Speedup filter_ordered
FabioLuporini Sep 11, 2023
2a81f6e
compiler: Speedup Dimension.__eq__
FabioLuporini Sep 11, 2023
232bda7
compiler: Speedup DDA by introducing null_ispace
FabioLuporini Sep 11, 2023
f069244
compiler: Speedup IntervalGroup.reorder
FabioLuporini Sep 11, 2023
e83fed3
compiler: Speedup DDA through lazy TimedAccess
FabioLuporini Sep 11, 2023
4594ff6
compiler: Speedup DDA by postponing distance calculation
FabioLuporini Sep 12, 2023
c96b266
compiler: Speedup IterationSpace.project
FabioLuporini Sep 12, 2023
7a3bd90
compiler: Speedup DDA avoiding redundant TimedAccesses
FabioLuporini Sep 12, 2023
b9e1056
compiler: Speedup Dependence creation
FabioLuporini Sep 12, 2023
9ef9200
compiler: Speedup DDA by lazily generating writes
FabioLuporini Sep 12, 2023
8b85d75
compiler: Refactor factorizer to enable external customization
FabioLuporini Sep 13, 2023
0c520a8
compiler: Revamp and simplify CIRE's cost model
FabioLuporini Sep 13, 2023
503f2ec
compiler: Fix unexpansion of tensor objects
FabioLuporini Sep 18, 2023
5d843fb
compiler: Speedup instantiation of DiscreteFunctions
FabioLuporini Sep 18, 2023
e72f8e9
compiler: Speedup ordering of IndexDerivatives
FabioLuporini Sep 18, 2023
3702873
compiler: Improve StencilDimension interface
FabioLuporini Sep 19, 2023
bfa7375
compiler: Remove post-rebase redundant meth
FabioLuporini Oct 6, 2023
9580746
compiler: Fix IndexDerivative lowering
FabioLuporini Oct 9, 2023
b4264b1
compiler: Restrict fusion of FetchUpdates
FabioLuporini Oct 9, 2023
c1ebe2f
api: cleanup fd transpose implementation
mloubout Oct 11, 2023
2 changes: 1 addition & 1 deletion conftest.py
@@ -163,7 +163,7 @@ def parallel(item):
# OpenMPI requires an explicit flag for oversubscription. We need it as some
# of the MPI tests will spawn lots of processes
if mpi_distro == 'OpenMPI':
- call = [mpi_exec, '--oversubscribe', '--timeout', '150'] + args
+ call = [mpi_exec, '--oversubscribe', '--timeout', '300'] + args
else:
call = [mpi_exec] + args

15 changes: 8 additions & 7 deletions devito/core/operator.py
@@ -329,9 +329,9 @@ class OptOption(object):

class ParTileArg(tuple):

- def __new__(cls, items, shm=0, tag=None):
+ def __new__(cls, items, rule=None, tag=None):
obj = super().__new__(cls, items)
- obj.shm = shm
+ obj.rule = rule
obj.tag = tag
return obj

@@ -371,14 +371,15 @@ def __new__(cls, items, default=None):

try:
y = items[1]
- if is_integer(y):
- # E.g., ((32, 4, 8), 1)
- # E.g., ((32, 4, 8), 1, 'tag')
+ if is_integer(y) or isinstance(y, str) or y is None:
+ # E.g., ((32, 4, 8), 'rule')
+ # E.g., ((32, 4, 8), 'rule', 'tag')
items = (ParTileArg(*items),)
else:
try:
- # E.g., (((32, 4, 8), 1), ((32, 4, 4), 2))
- # E.g., (((32, 4, 8), 1, 'tag0'), ((32, 4, 4), 2, 'tag1'))
+ # E.g., (((32, 4, 8), 'rule'), ((32, 4, 4), 'rule'))
+ # E.g., (((32, 4, 8), 'rule0', 'tag0'),
+ # ((32, 4, 4), 'rule1', 'tag1'))
items = tuple(ParTileArg(*i) for i in items)
except TypeError:
# E.g., ((32, 4, 8), (32, 4, 4))
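
The rename from `shm` to `rule` can be tried in isolation with a small sketch. The class body below mirrors the diff; the `normalize` helper and the sample values are assumptions made for illustration, and the real parsing in `devito/core/operator.py` covers more input shapes than this.

```python
class ParTileArg(tuple):
    """Tile shape that also carries an optional rule and tag (as in the diff)."""

    def __new__(cls, items, rule=None, tag=None):
        obj = super().__new__(cls, items)
        obj.rule = rule
        obj.tag = tag
        return obj


def normalize(items):
    """Hypothetical helper: turn user input into a tuple of ParTileArg."""
    y = items[1]
    if isinstance(y, (int, str)) or y is None:
        # Single entry, e.g. ((32, 4, 8), 'rule') or ((32, 4, 8), 'rule', 'tag')
        return (ParTileArg(*items),)
    # Several entries, e.g. (((32, 4, 8), 'rule0'), ((32, 4, 4), 'rule1'))
    return tuple(ParTileArg(*i) for i in items)


single = normalize(((32, 4, 8), 'rule', 'tag'))
multi = normalize((((32, 4, 8), 'rule0', 'tag0'), ((32, 4, 4), 'rule1', 'tag1')))
```

Subclassing `tuple` keeps the tile shape usable wherever a plain tuple is expected, while `rule` and `tag` ride along as attributes.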
24 changes: 23 additions & 1 deletion devito/finite_differences/differentiable.py
@@ -7,6 +7,7 @@
import sympy
from sympy.core.add import _addsort
from sympy.core.mul import _keep_coeff, _mulsort
+ from sympy.core.core import ordering_of_classes
from sympy.core.decorators import call_highest_priority
from sympy.core.evalf import evalf_table

@@ -556,6 +557,9 @@ def __repr__(self):

__str__ = __repr__

+ def _sympystr(self, printer):
+ return str(self)

def _hashable_content(self):
return super()._hashable_content() + (self.dimensions,)

@@ -621,7 +625,7 @@ def __eq__(self, other):
__hash__ = sympy.Basic.__hash__

def _hashable_content(self):
- return (self.name, self.dimension, hash(tuple(self.weights)))
+ return (self.name, self.dimension, str(self.weights))

@property
def dimension(self):
@@ -665,6 +669,20 @@ def __new__(cls, expr, mapper, **kwargs):
def _hashable_content(self):
return super()._hashable_content() + (self.mapper,)

+ def compare(self, other):
+ if self is other:
+ return 0
+ n1 = self.__class__
+ n2 = other.__class__
+ if n1.__name__ == n2.__name__:
+ return self.base.compare(other.base)
+ else:
+ return super().compare(other)
+
+ @cached_property
+ def base(self):
+ return self.expr.func(*[a for a in self.expr.args if a is not self.weights])

@property
def weights(self):
return self._weights
@@ -693,6 +711,10 @@ def _evaluate(self, **kwargs):
return expr


+ ordering_of_classes.insert(ordering_of_classes.index('Derivative') + 1,
+ 'IndexDerivative')

Review comment (Contributor): What is this for?

Reply (Contributor Author): adding comment in upcoming PR
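
The thread does not state the purpose of this call. One plausible reading, offered here as an assumption, is that SymPy consults this list of class names when it has to rank two classes, so placing 'IndexDerivative' right after 'Derivative' gives the new class a defined position. The sketch below uses a plain, shortened list as a stand-in for SymPy's own, and `class_rank` is a hypothetical helper.

```python
# Stand-in for SymPy's list of class names (shortened; an assumption)
ordering = ['Add', 'Mul', 'Derivative', 'Integral']

# Same idiom as the diff: place the new name right after 'Derivative'
ordering.insert(ordering.index('Derivative') + 1, 'IndexDerivative')


def class_rank(name):
    """Hypothetical helper: rank of a class name; unknown names sort last."""
    try:
        return ordering.index(name)
    except ValueError:
        return len(ordering)


names = sorted(['Integral', 'IndexDerivative', 'Add', 'Derivative'], key=class_rank)
```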


class EvalDerivative(DifferentiableOp, sympy.Add):

is_commutative = True
16 changes: 11 additions & 5 deletions devito/finite_differences/finite_difference.py
@@ -207,9 +207,11 @@ def generic_derivative(expr, dim, fd_order, deriv_order, matvec=direct, x0=None,
matvec, x0, symbolic, expand)


- def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic, expand):
+ def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic,
+ expand):
# The stencil indices
- indices, x0 = generate_indices(expr, dim, fd_order, side=side, matvec=matvec, x0=x0)
+ indices, x0 = generate_indices(expr, dim, fd_order, side=side, matvec=matvec,
+ x0=x0)

# Finite difference weights from Taylor approximation given these positions
if symbolic:
@@ -218,15 +220,19 @@ def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic
weights = numeric_weights(deriv_order, indices, x0)

# Enforce fixed precision FD coefficients to avoid variations in results
- weights = [sympify(w).evalf(_PRECISION) for w in weights]
+ weights = [sympify(w).evalf(_PRECISION) for w in weights][::matvec.val]

# Transpose the FD, if necessary
- if matvec:
- indices = indices.scale(matvec.val)
+ if matvec == transpose:
+ indices = indices.transpose()

# Shift index due to staggering, if any
indices = indices.shift(-(expr.indices_ref[dim] - dim))

+ # The user may wish to restrict expansion to selected derivatives
+ if callable(expand):
+ expand = expand(dim)

if not expand and indices.expr is not None:
weights = Weights(name='w', dimensions=indices.free_dim, initvalue=weights)
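
Two idioms in this hunk are easy to check in isolation: the `[::matvec.val]` slice, and `expand` being either a flag or a callable. The sketch below is a toy, not Devito code. It assumes `val` is +1 for the direct mode and -1 for the transposed one (which is what the slice step suggests), and `resolve_expand`, `only_xy`, and the sample weights are illustrative names and values.

```python
from types import SimpleNamespace

# Stand-ins for the two matvec modes; `val` assumed to be +1 or -1
direct = SimpleNamespace(name='direct', val=1)
transpose = SimpleNamespace(name='transpose', val=-1)

weights = [-1.5, 2.0, -0.5]   # sample one-sided first-derivative weights

# A step of +1 keeps the order, a step of -1 reverses it
w_direct = weights[::direct.val]
w_transpose = weights[::transpose.val]


def resolve_expand(expand, dim):
    """Accept either a plain flag or a per-dimension callable."""
    return expand(dim) if callable(expand) else expand


# Hypothetical policy: expand every derivative except those along 'z'
only_xy = lambda dim: dim != 'z'
```

Folding the reversal into the slice means the weights are reordered in the same statement that fixes their precision, so no separate branch is needed for them.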

16 changes: 11 additions & 5 deletions devito/finite_differences/tools.py
@@ -175,27 +175,33 @@ def __repr__(self):
def spacing(self):
return self.dim.spacing

- def scale(self, v):
+ def transpose(self):
"""
- Construct a new IndexSet with all indices scaled by `v`.
+ Transpose the IndexSet.
"""
- mapper = {self.spacing: v*self.spacing}
+ mapper = {self.spacing: -self.spacing}

indices = []
- for i in self:
+ for i in reversed(self):
try:
iloc = i.xreplace(mapper)
except AttributeError:
# Pure number -> sympify
iloc = sympify(i).xreplace(mapper)
indices.append(iloc)

+ try:
+ free_dim = self.free_dim.transpose()
+ mapper.update({self.free_dim: -free_dim})
+ except AttributeError:
+ free_dim = self.free_dim

try:
expr = self.expr.xreplace(mapper)
except AttributeError:
expr = None

- return IndexSet(self.dim, indices, expr=expr, fd=self.free_dim)
+ return IndexSet(self.dim, indices, expr=expr, fd=free_dim)
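
The effect of `transpose` on the stencil points can be sketched with integer offsets standing in for the symbolic indices. This assumes that negating the spacing amounts to negating each offset expressed in units of the spacing; `transpose_offsets` is a hypothetical name.

```python
def transpose_offsets(offsets):
    """Reverse the stencil and flip the sign of every offset."""
    return [-o for o in reversed(offsets)]


forward = [0, 1, 2]                       # points x, x + h, x + 2h
backward = transpose_offsets(forward)     # [-2, -1, 0]
```

Applying the operation twice returns the original offsets, which is the behaviour one would expect from a transpose.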

def shift(self, v):
"""
49 changes: 40 additions & 9 deletions devito/ir/clusters/algorithms.py
@@ -6,15 +6,15 @@
import sympy

from devito.exceptions import InvalidOperator
- from devito.ir.support import (Any, Backward, Forward, IterationSpace,
+ from devito.ir.support import (Any, Backward, Forward, IterationSpace, erange,
pull_dims)
from devito.ir.clusters.analysis import analyze
from devito.ir.clusters.cluster import Cluster, ClusterGroup
from devito.ir.clusters.visitors import Queue, QueueStateful, cluster_pass
from devito.mpi.halo_scheme import HaloScheme, HaloTouch
from devito.symbolics import retrieve_indexed, uxreplace, xreplace_indices
from devito.tools import (DefaultOrderedDict, Stamp, as_mapper, flatten,
- is_integer, timed_pass)
+ is_integer, timed_pass, toposort)
from devito.types import Array, Eq, Symbol
from devito.types.dimension import BOTTOM, ModuloDimension

@@ -29,6 +29,7 @@ def clusterize(exprs, **kwargs):
clusters = [Cluster(e, e.ispace) for e in exprs]

# Setup the IterationSpaces based on data dependence analysis
+ clusters = impose_total_ordering(clusters)
clusters = Schedule().process(clusters)

# Handle SteppingDimensions
@@ -49,6 +50,29 @@ def clusterize(exprs, **kwargs):
return ClusterGroup(clusters)


+ def impose_total_ordering(clusters):
+ """
+ Create a new sequence of Clusters whose IterationSpaces are totally ordered
+ according to a global set of relations.
+ """
+ global_relations = set().union(*[c.ispace.relations for c in clusters])
+ ordering = toposort(global_relations)
+
+ processed = []
+ for c in clusters:
+ key = lambda d: ordering.index(d)
+ try:
+ relations = {tuple(sorted(c.ispace.itdims, key=key))}
+ except ValueError:
+ # See issue #2204
+ relations = c.ispace.relations
+ ispace = c.ispace.reorder(relations=relations, mode='total')
+
+ processed.append(c.rebuild(ispace=ispace))
+
+ return processed
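
The idea behind `impose_total_ordering` can be sketched with the standard library: merge all partial relations into one global ordering, then sort each cluster's dimensions by their position in it. The sketch uses `graphlib.TopologicalSorter` as a stand-in for `devito.tools.toposort`, whose exact behaviour is not shown in this diff; dimension names are plain strings and `total_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def total_order(relations):
    """Merge partial orderings (tuples of names) into one global ordering."""
    ts = TopologicalSorter()
    for rel in relations:
        for before, after in zip(rel, rel[1:]):
            ts.add(after, before)   # `after` must come later than `before`
    return list(ts.static_order())


relations = {('t', 'x', 'y'), ('y', 'z')}
ordering = total_order(relations)          # ['t', 'x', 'y', 'z']

# Sort one cluster's dimensions by their position in the global ordering
cluster_dims = ['z', 't', 'y']
sorted_dims = sorted(cluster_dims, key=ordering.index)   # ['t', 'y', 'z']
```

`ordering.index` raises `ValueError` for a name missing from the global ordering, which is the case the `except ValueError` branch in the diff falls back on.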


class Schedule(QueueStateful):

"""
@@ -121,10 +145,12 @@ def callback(self, clusters, prefix, backlog=None, known_break=None):
require_break = scope.d_flow.cause & maybe_break
if require_break:
backlog = [clusters[-1]] + backlog
- # Try with increasingly smaller ClusterGroups until the ambiguity is gone
+ # Try with increasingly smaller ClusterGroups until the
+ # ambiguity is gone
return self.callback(clusters[:-1], prefix, backlog, require_break)

- # Schedule Clusters over different IterationSpaces if this increases parallelism
+ # Schedule Clusters over different IterationSpaces if this increases
+ # parallelism
for i in range(1, len(clusters)):
if self._break_for_parallelism(scope, candidates, i):
return self.callback(clusters[:i], prefix, clusters[i:] + backlog,
@@ -146,8 +172,8 @@ def callback(self, clusters, prefix, backlog=None, known_break=None):
if not backlog:
return processed

- # Handle the backlog -- the Clusters characterized by flow- and anti-dependences
- # along one or more Dimensions
+ # Handle the backlog -- the Clusters characterized by flow- and
+ # anti-dependences along one or more Dimensions
idir = {d: Any for d in known_break}
stamp = Stamp()
for i, c in enumerate(list(backlog)):
@@ -278,7 +304,11 @@ def callback(self, clusters, prefix):
size = i.function.shape_allocated[d]
assert is_integer(size)

- mapper[size][si].add(iaf)
+ # Resolve StencilDimensions in case of unexpanded expressions
+ # E.g. `i0 + t` -> `(t - 1, t, t + 1)`
+ iafs = erange(iaf)
+
+ mapper[size][si].update(iafs)

# Construct the ModuloDimensions
mds = []
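
The switch from `.add(iaf)` to `.update(iafs)` follows from the comment in the hunk: an unexpanded access such as `i0 + t` stands for several concrete accesses. A toy version with integer offsets relative to `t` is sketched below; `expand_range` is a hypothetical stand-in for `erange`, whose real signature is not shown in this diff.

```python
def expand_range(base_offset, lo, hi):
    """Hypothetical stand-in for `erange`: enumerate every offset that
    `base_offset + i0` can take when i0 spans [lo, hi]."""
    return tuple(base_offset + i for i in range(lo, hi + 1))


seen = set()
seen.add(0)                              # expanded access: just `t`
seen.update(expand_range(0, -1, 1))      # unexpanded access `i0 + t`, i0 in [-1, 1]
# seen == {-1, 0, 1}, i.e. t - 1, t, t + 1
```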
@@ -288,7 +318,8 @@ def callback(self, clusters, prefix):
# SymPy's index ordering (t, t-1, t+1) after modulo replacement so
# that associativity errors are consistent. This corresponds to
# sorting offsets {-1, 0, 1} as {0, -1, 1} assigning -inf to 0
- siafs = sorted(iafs, key=lambda i: -np.inf if i - si == 0 else (i - si))
+ key = lambda i: -np.inf if i - si == 0 else (i - si)
+ siafs = sorted(iafs, key=key)

for iaf in siafs:
name = '%s%d' % (si.name, len(mds))
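
The sort key described in the comment above can be checked on plain integers. This sketch uses `math.inf` in place of `np.inf` to stay within the standard library, and `modulo_key` is an illustrative name.

```python
import math


def modulo_key(offset):
    """Rank offset 0 first, then the remaining offsets in increasing order."""
    return -math.inf if offset == 0 else offset


ordered = sorted([-1, 0, 1], key=modulo_key)   # [0, -1, 1]
```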
@@ -452,7 +483,7 @@ def normalize_reductions_dense(cluster, sregistry, options):
"""
opt_mapify_reduce = options['mapify-reduce']

- dims = [d for d in cluster.properties.dimensions
+ dims = [d for d in cluster.ispace.itdims
if cluster.properties.is_parallel_atomic(d)]

if not dims: