How do we get performance-portable finite element solvers that are efficient, generic and easy to use in the hands of domain scientists?
... to isolate numerical methods from their mapping to hardware
... and make decisions at the highest abstraction level possible
... for generative, instead of transformative optimisations
... capture and efficiently express characteristics of the application/problem domain
... encapsulates specialist expertise to deliver problem- and platform-specific optimisations
... makes problem-specific generated code transparently available to the application at runtime
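The runtime code generation idea in the last point can be illustrated with a minimal sketch. Everything here is hypothetical: `generate_axpy_source`, `get_kernel` and the cache are names invented for illustration, and generated Python source stands in for whatever backend-specific code a real system would emit. The point is only that a kernel specialised to the problem is generated, compiled on first use and served from a cache afterwards, without the caller noticing.

```python
import hashlib

# Hypothetical cache: maps a hash of the generated source to the compiled function.
_kernel_cache = {}


def generate_axpy_source(alpha):
    """Generate source for a kernel specialised on the constant alpha."""
    return (
        "def kernel(x, y):\n"
        f"    return [{alpha!r} * xi + yi for xi, yi in zip(x, y)]\n"
    )


def get_kernel(alpha):
    """Return a compiled kernel for alpha, compiling it only on first request."""
    source = generate_axpy_source(alpha)
    key = hashlib.sha256(source.encode()).hexdigest()
    if key not in _kernel_cache:
        namespace = {}
        # Compile the generated source at runtime and extract the function.
        exec(compile(source, f"<generated {key[:8]}>", "exec"), namespace)
        _kernel_cache[key] = namespace["kernel"]
    return _kernel_cache[key]


k = get_kernel(2.0)
result = k([1.0, 2.0], [10.0, 20.0])
```

A second call to `get_kernel(2.0)` returns the cached function object, so the generation and compilation cost is paid once per distinct kernel.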
The weak form of the Helmholtz equation
is expressed in UFL as follows:
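As a point of reference, and assuming the Helmholtz problem is posed as $-\nabla^2 u - \lambda u = f$ on $\Omega$ with natural (zero-flux) boundary conditions, a sign convention that the text above does not fix, the weak form for a test function $v$ and trial function $u$ can be written as:

```latex
\int_\Omega \nabla v \cdot \nabla u \;-\; \lambda\, v\, u \;\mathrm{d}V
  \;=\; \int_\Omega v\, f \;\mathrm{d}V
```

Under the same assumption, the corresponding UFL forms would be along the lines of `a = (dot(grad(v), grad(u)) - lmbda*v*u)*dx` and `L = v*f*dx`, where `lmbda` is an illustrative name for $\lambda$.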
[1] FFC is the FEniCS Form Compiler
[2] UFL is the Unified Form Language from the FEniCS project
... means computing the same kernel for every mesh entity (cell, facet): a perfect match for the PyOP2 abstraction
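The "same kernel for every mesh entity" pattern can be sketched in plain NumPy. This is not the PyOP2 API: `par_loop` and `lumped_mass_kernel` are hypothetical names chosen for illustration, and the loop runs sequentially. It shows a local kernel applied to each cell, with data gathered and scattered through a cell-to-node map.

```python
import numpy as np


def par_loop(kernel, cell_to_node, coords, global_vec):
    """Apply the same local kernel to every cell and scatter-add the result."""
    for nodes in cell_to_node:
        # Gather the cell's vertex coordinates through the indirection map.
        local = kernel(coords[nodes])
        # Accumulate local contributions into nodes shared between cells.
        np.add.at(global_vec, nodes, local)


def lumped_mass_kernel(xy):
    """Local kernel: split the triangle's area equally between its 3 vertices."""
    (x0, y0), (x1, y1), (x2, y2) = xy
    area = 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    return np.full(3, area / 3.0)


# Unit square split into two triangles sharing the diagonal 0-2.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
cell_to_node = np.array([[0, 1, 2], [0, 2, 3]])

mass = np.zeros(4)
par_loop(lumped_mass_kernel, cell_to_node, coords, mass)
```

The kernel knows nothing about the mesh or the global vector; only the loop construct does, which is what leaves room for mapping the iteration to different hardware.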
... implemented as a thin wrapper on top of backend-specific linear algebra packages: petsc4py on the CPU, Cusp on the GPU
Measure total time to solution for 100 time steps of an advection-diffusion test case; matrix/vector re-assembled every time step.
CG with Jacobi preconditioning using PETSc 3.3 (PyOP2) and PETSc 3.2 (DOLFIN)
2x Intel Xeon E5650 Westmere 6-core (HT off), 48GB RAM
NVIDIA GeForce GTX680 (Kepler)
2D unit square meshed with triangles (200 to 204,800 elements)
Revision 7122, Tensor representation, CPP optimisations on
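For readers unfamiliar with the solver named in the setup above, the following is a minimal dense NumPy sketch of conjugate gradients with Jacobi (diagonal) preconditioning. It is a textbook formulation, not the PETSc implementation used in the benchmark; the function name `jacobi_pcg`, its tolerances and the small test matrix are assumptions made for illustration.

```python
import numpy as np


def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    x = np.zeros_like(b)
    inv_diag = 1.0 / np.diag(A)   # Jacobi preconditioner: inverse of the diagonal
    r = b - A @ x                 # initial residual
    z = inv_diag * r              # preconditioned residual
    p = z.copy()                  # initial search direction
    rz = r @ z
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)     # step length along p
        x += alpha * p
        r -= alpha * Ap
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p  # new direction, conjugate to the previous ones
        rz = rz_new
    return x


# Small SPD test problem: tridiagonal 1D Laplacian-like matrix.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = jacobi_pcg(A, b)
```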
https://code.launchpad.net/~mapdes/ffc/pyop2
https://code.launchpad.net/~fluidity-core/fluidity/pyop2
https://github.com/OP2/PyOP2_benchmarks