## 04 August 2014

### MPI parallelism

After quite a struggle, I have been able to obtain a working implementation of fem-fenics supporting MPI parallelism. Let's go through an example and highlight what has changed lately.

pkg load fem-fenics msh

ufl start Poisson
ufl element = FiniteElement '("Lagrange", triangle, 1)'
ufl u = TrialFunction (element)
ufl v = TestFunction (element)
ufl f = Coefficient (element)
ufl g = Coefficient (element)
ufl a = "inner (grad (u), grad (v))*dx"
ufl L = f*v*dx + g*v*ds
ufl end

# Create mesh and define function space
x = y = linspace (0, 1, 33);
[mesh, facets] = Mesh (msh2m_structured_mesh (x, y, 1, 1:4));

V = FunctionSpace ('Poisson', mesh);

# Define boundary condition
bc = DirichletBC (V, @(x, y) 0.0, facets, [2;4]);

f = Expression ('f', @(x,y) 10*exp(-((x - 0.5)^2 + (y - 0.5)^2) / 0.02));
g = Expression ('g', @(x,y) sin (5.0 * x));

a = BilinearForm ('Poisson', V, V);
L = LinearForm ('Poisson', V, f, g);

# Compute solution
[A, b] = assemble_system (a, L, bc);
sol = A \ b;
u = Function ('u', V, sol);

# Save solution in VTK format
save (u, 'poisson');

The basic structure has remained the same. DOLFIN boasts the capability to be run both in serial and in parallel execution without intervening on the code, so I did my best to have the same behaviour from fem-fenics. The Poisson.m m-file above can be run either as you usually would do with any other m-file, or from the command line with an invocation such as:

mpiexec -np 4 octave --eval Poisson

Now, how is this possible? In the beginning, with the ufl block, the variational problem is defined in UFL language, written to an .ufl file and compiled via FFC. Since IO is performed, ufl.m ensures that only process zero will open and write to the file. Moreover, a MPI barrier makes sure that no process will proceed before the .ufl file is imported.

As soon as the just-in-time compilation is over, there are two instructions to build the mesh, in this case on the unit square. For this, we rely on the msh package, which returns a PDE-tool-like representation of it. Mesh.oct must, then, convert it to DOLFIN internal representation and distribute it among processes. Here comes an issue: fem-fenics relies on markers present in the PDE-tool format to impose essential boundary conditions, and in serial runs dolfin::Mesh can store them, so that DirichletBC.oct needs just to know the boundary subset label. Unfortunately, this feature is not supported yet in parallel by the DOLFIN library, then Mesh.oct has been edited to return, if requested, also a meshfunction holding this information, in the example above facets. This way markers can be conveyed to DirichletBC.oct and boundary conditions can be applied on the correct edges.

Further intervention was needed for the assembly and solve phase. In assemble_system.oct both the matrix and the vector are assembled locally on each portion of the mesh and, afterwards, gathered on process zero and joined, so that the system can be solved with the backslash instruction of Octave. In order to allow output in VTK format, in Function.oct the solution is split up and properly distributed among processes, so that each one holds the portion of degrees of freedom related to its subdomain and to the neighbouring vertices. After save.oct has written the solution to poisson.pvd and its auxiliary files, it can be visualised with ParaView.