Getting started

Basic usage

Shenfun consists of classes and functions whoose purpose are to make it easier to implement PDE’s with spectral methods in simple tensor product domains. The most important everyday tools are

A good place to get started is by creating a Basis(). There are five families of bases: Fourier, Chebyshev, Legendre, Laguerre, Hermite and Jacobi. All bases are defined on a one-dimensional domain, with their own basis functions and quadrature points. For example, we have the regular Chebyshev basis \(\{T_k\}_{k=0}^{N-1}\), where \(T_k\) is the \(k\)’th Chebyshev polynomial of the first kind. To create such a basis with 8 quadrature points (i.e., \(\{T_k\}_{k=0}^{7}\)) do:

from shenfun import Basis
N = 8
T = Basis(N, 'Chebyshev', bc=None)

Here bc=None is used to indicate that there are no boundary conditions associated with this basis, which is the default, so it could just as well have been left out. To create a regular Legendre basis (i.e., \(\{L_k\}_{k=0}^{N-1}\), where \(L_k\) is the \(k\)’th Legendre polynomial), just replace Chebyshev with Legendre above. And to create a Fourier basis, just use Fourier.

The basis \(T = \{T_k\}_{k=0}^{N-1}\) has many useful methods associated with it, and we may experiment a little. A Function u using basis \(T\) has expansion

(1)\[ u(x) = \sum_{k=0}^{7} \hat{u}_k T_k(x)\]

and an instance of this function (initialized with \(\{\hat{u}_k\}_{k=0}^7=0\)) is created in shenfun as:

u = Function(T)

Consider now for exampel the polynomial \(2x^2-1\), which happens to be exactly equal to \(T_2(x)\). We can create this polynomial using sympy

import sympy as sp
x = sp.Symbol('x')
u = 2*x**2 - 1  # or simply u = sp.chebyshevt(2, x)

The Sympy function u can now be evaluated on the quadrature points of basis \(T\):

xj = T.mesh()
ue = Array(T)
ue[:] = [u.subs(x, xx) for xx in xj]
  [ 0.98078528  0.83146961  0.55557023  0.19509032 -0.19509032 -0.55557023
   -0.83146961 -0.98078528]
  [ 0.92387953  0.38268343 -0.38268343 -0.92387953 -0.92387953 -0.38268343
    0.38268343  0.92387953]

We see that ue is an Array on the basis T, and not a Function. The Array and Function classes are both subclasses of Numpy’s ndarray, and represent the two arrays associated with the spectral Galerkin function, like (1). The Function represent the entire spectral Galerkin function, with array values corresponding to the expansion coefficients \(\hat{u}\). The Array represent the spectral Galerkin function evaluated on the quadrature mesh of the basis T, i.e., here \(u(x_i), \forall \, i \in 0, 1, \ldots, 7\).

We now want to find the Function uh corresponding to Array ue. Considering (1), this corresponds to finding \(\hat{u}_k\) if the left hand side \(u(x_j)\) is known for all quadrature points \(x_j\).

Since we already know that ue is equal to the second Chebyshev polynomial, we should get an array of expansion coefficients equal to \(\hat{u} = (0, 0, 1, 0, 0, 0, 0, 0)\). We can compute uh either by using project() or a forward transform:

uh = Function(T)
uh = T.forward(ue, uh)
# or
# uh = ue.forward(uh)
# or
# uh = project(ue, T)
  [-1.38777878e-17  6.72002101e-17  1.00000000e+00 -1.95146303e-16
    1.96261557e-17  1.15426347e-16 -1.11022302e-16  1.65163507e-16]

So we see that the projection works to machine precision.

The projection is mathematically: find \(u_h \in T\), such that

\[(u_h - u, v)_w = 0 \quad \forall v \in T,\]

where \(v\) is a test function, \(u_h\) is a trial function and the notation \((\cdot, \cdot)_w\) was introduced in (4). Using now \(v=T_k\) and \(u_h=\sum_{j=0}^7 \hat{u}_j T_j\), we get

\[\begin{split}(\sum_{j=0}^7 \hat{u}_j T_j, T_k)_w &= (u, T_k)_w, \\ \sum_{j=0}^7 (T_j, T_k)_w \hat{u}_j &= (u, T_k)_w,\end{split}\]

for all \(k \in 0, 1, \ldots, 7\). This can be rewritten on matrix form as

\[B_{kj} \hat{u}_j = \tilde{u}_k,\]

where \(B_{kj} = (T_j, T_k)_w\), \(\tilde{u}_k = (u, T_k)_w\) and summation is implied by the repeating \(j\) indices. Since the Chebyshev polynomials are orthogonal the mass matrix \(B_{kj}\) is diagonal. We can assemble both \(B_{kj}\) and \(\tilde{u}_j\) with shenfun, and at the same time introduce the TestFunction, TrialFunction classes and the inner() function:

from shenfun import TestFunction, TrialFunction, inner
u = TrialFunction(T)
v = TestFunction(T)
B = inner(u, v)
u_tilde = inner(ue, v)
  {0: array([3.14159265, 1.57079633, 1.57079633, 1.57079633, 1.57079633,
   1.57079633, 1.57079633, 1.57079633])}
  [-4.35983562e-17  1.05557843e-16  1.57079633e+00 -3.06535096e-16
    3.08286933e-17  1.81311282e-16 -1.74393425e-16  2.59438230e-16]

The inner() function represents the (weighted) inner product and it expects one test function, and possibly one trial function. If, as here, it also contains a trial function, then a matrix is returned. If inner() contains one test, but no trial function, then an array is returned. Finally, if inner() contains no test nor trial function, but instead a number and an Array, like:

a = Array(T, val=1)
print(inner(1, a))

then inner() represents a non-weighted integral over the domain. Here it returns the length of the domain (2.0) since a is initialized to unity.

Note that the matrix \(B\) assembled above is stored using shenfun’s SpectralMatrix class, which is a subclass of Python’s dictionary, where the keys are the diagonals and the values are the diagonal entries. The matrix \(B\) is seen to have only one diagonal (the principal) \(\{B_{ii}\}_{i=0}^{7}\).

With the matrix comes a solve method and we can solve for \(\hat{u}\) through:

u_hat = Function(T)
u_hat = B.solve(u_tilde, u=u_hat)
  [-1.38777878e-17  6.72002101e-17  1.00000000e+00 -1.95146303e-16
    1.96261557e-17  1.15426347e-16 -1.11022302e-16  1.65163507e-16]

which obviously is exactly the same as we found using project() or the T.forward function.

Note that Array merely is a subclass of Numpy’s ndarray, whereas Function is a subclass of both Numpy’s ndarray and the BasisFunction class. The latter is used as a base class for arguments to bilinear and linear forms, and is as such a base class also for TrialFunction and TestFunction. An instance of the Array class cannot be used in forms, except from regular inner products of numbers or test function vs an Array. To illustrate, lets create some forms, where all except the last one is ok:

T = Basis(12, 'Legendre')
u = TrialFunction(T)
v = TestFunction(T)
uf = Function(T)
ua = Array(T)
A = inner(v, u)   # Mass matrix
c = inner(v, ua)  # ok, a scalar product
d = inner(v, uf)  # ok, a scalar product (slower than above)
e = inner(1, ua)  # ok, non-weighted integral of ua over domain
df = Dx(uf, 0, 1) # ok
da = Dx(ua, 0, 1) # Not ok

    AssertionError                            Traceback (most recent call last)
    <ipython-input-14-3b957937279f> in <module>
    ----> 1 da = inner(v, Dx(ua, 0, 1))

    ~/MySoftware/shenfun/shenfun/forms/ in Dx(test, x, k)
         82         Number of derivatives
         83     """
    ---> 84     assert isinstance(test, (Expr, BasisFunction))
         86     if isinstance(test, BasisFunction):


So it is not possible to perform operations that involve differentiation (Dx represents a partial derivative) on an Array instance. This is because the ua does not contain more information than its values and its TensorProductSpace. A BasisFunction instance, on the other hand, can be manipulated with operators like div() grad() in creating instances of the Expr class, see Operators.

Note that any rules for efficient use of Numpy ndarrays, like vectorization, also applies to Function and Array instances.


Operators act on any single instance of a BasisFunction, which can be Function, TrialFunction or TestFunction. The implemented operators are:

Operators are used in variational forms assembled using inner() or project(), like:

A = inner(grad(u), grad(v))

which assembles a stiffness matrix A. Note that the two expressions fed to inner must have consistent rank. Here, for example, both grad(u) and grad(v) have rank 1 of a vector.

Multidimensional problems

As described in the introduction, a multidimensional problem is handled using tensor product spaces, that have basis functions generated from taking the outer products of one-dimensional basis functions. We create tensor product spaces using the class TensorProductSpace:

N, M = (12, 16)
C0 = Basis(N, 'L', bc=(0, 0), scaled=True)
K0 = Basis(M, 'F', dtype='d')
T = TensorProductSpace(comm, (C0, K0))

Associated with this is a Cartesian mesh \([-1, 1] \times [0, 2\pi]\). We use classes Function, TrialFunction and TestFunction exactly as before:

u = TrialFunction(T)
v = TestFunction(T)
A = inner(grad(u), grad(v))

However, now A will be a tensor product matrix, or more correctly, the sum of two tensor product matrices. This can be seen if we look at the equations beyond the code. In this case we are using a composite Legendre basis for the first direction and Fourier exponentials for the second, and the tensor product basis function is

\[\begin{split}v_{kl}(x, y) &= \frac{1}{\sqrt{4k+6}}(L_k(x) - L_{k+2}(x)) \exp(\imath l y), \\ &= \Psi_k(x) \phi_l(y),\end{split}\]

where \(L_k\) is the \(k\)’th Legendre polynomial, \(\psi_k = (L_k-L_{k+2})/\sqrt{4k+6}\) and \(\phi_l = \exp(\imath l y)\) are used for simplicity in later derivations. The trial function becomes

\[u(x, y) = \sum_k \sum_l \hat{u}_{kl} v_{kl}\]

and the inner product is

(2)\[\begin{split}(\nabla u, \nabla v)_w &= \int_{-1}^{1} \int_{0}^{2 \pi} \nabla u \cdot \nabla v dxdy, \\ &= \int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} dxdy, \\ &= \int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} dxdy + \int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial u}{\partial y} \frac{\partial v}{\partial y} dxdy,\end{split}\]

showing that it is the sum of two tensor product matrices. However, each one of these two terms contains the outer product of smaller matrices. To see this we need to insert for the trial and test functions (using \(v_{mn}\) for test):

\[\begin{split}\int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} dxdy &= \int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial}{\partial x} \left( \sum_k \sum_l \hat{u}_{kl} \Psi_k(x) \phi_l(y) \right) \frac{\partial}{\partial x} \left( \Psi_m(x) \phi_n(y) \right)dxdy, \\ &= \sum_k \sum_l \underbrace{ \int_{-1}^{1} \frac{\partial \Psi_k(x)}{\partial x} \frac{\partial \Psi_m(x)}{\partial x} dx}_{A_{mk}} \underbrace{ \int_{0}^{2 \pi} \phi_l(y) \phi_{n}(y) dy}_{B_{nl}} \, \hat{u}_{kl},\end{split}\]

where \(A \in \mathbb{R}^{N-2 \times N-2}\) and \(B \in \mathbb{R}^{M \times M}\). The tensor product matrix \(A_{mk} B_{nl}\) (or in matrix notation \(A \otimes B\)) is the first item of the two items in the list that is returned by inner(grad(u), grad(v)). The other item is of course the second term in the last line of (2):

\[\begin{split}\int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial u}{\partial y} \frac{\partial v}{\partial y} dxdy &= \int_{-1}^{1} \int_{0}^{2 \pi} \frac{\partial}{\partial y} \left( \sum_k \sum_l \hat{u}_{kl} \Psi_k(x) \phi_l(y) \right) \frac{\partial}{\partial y} \left(\Psi_m(x) \phi_n(y) \right) dxdy \\ &= \sum_k \sum_l \underbrace{ \int_{-1}^{1} \Psi_k(x) \Psi_m(x) dx}_{C_{mk}} \underbrace{ \int_{0}^{2 \pi} \frac{\partial \phi_l(y)}{\partial y} \frac{ \phi_{n}(y) }{\partial y} dy}_{D_{nl}} \, \hat{u}_{kl}\end{split}\]

The tensor product matrices \(A_{mk} B_{nl}\) and \(C_{mk}D_{nl}\) are both instances of the TPMatrix class. Together they lead to linear algebra systems like:

(3)\[(A_{mk}B_{nl} + C_{mk}D_{nl}) \hat{u}_{kl} = \tilde{f}_{mn},\]


\[\tilde{f}_{mn} = (v_{mn}, f)_w,\]

for some right hand side \(f\), see, e.g., (5). Note that an alternative formulation here is

\[A \hat{u} B^T + C \hat{u} D^T = \tilde{f}\]

where \(\hat{u}\) and \(\tilde{f}\) are treated as regular matrices (\(\hat{u} \in \mathbb{R}^{N-2 \times M}\) and \(\tilde{f} \in \mathbb{R}^{N-2 \times M}\)). This formulation is utilized to derive efficient solvers for tensor product bases in multiple dimensions using the matrix decomposition method in [She94] and [She95].

Note that in our case the equation system (3) can be greatly simplified since three of the submatrices (\(A_{mk}, B_{nl}\) and \(D_{nl}\)) are diagonal. Even more, two of them equal the identity matrix

\[\begin{split}A_{mk} &= \delta_{mk}, \\ B_{nl} &= \delta_{nl},\end{split}\]

whereas the last one can be written in terms of the identity (no summation on repeating indices)

\[D_{nl} = -nl\delta_{nl}.\]

Inserting for this in (3) and simplifying by requiring that \(l=n\) in the second step, we get

(4)\[\begin{split}(\delta_{mk}\delta_{nl} - ln C_{mk}\delta_{nl}) \hat{u}_{kl} &= \tilde{f}_{mn}, \\ (\delta_{mk} - l^2 C_{mk}) \hat{u}_{kl} &= \tilde{f}_{ml}.\end{split}\]

Now if we keep \(l\) fixed this latter equation is simply a regular linear algebra problem to solve for \(\hat{u}_{kl}\), for all \(k\). Of course, this solve needs to be carried out for all \(l\).

Note that there is a generic solver available for the system (3) in SolverGeneric2NP that makes no assumptions on diagonality. However, this solver will, naturally, be quite a bit slower than a tailored solver that takes advantage of diagonality. For the Poisson equation such solvers are available for both Legendre and Chebyshev bases, see the extended demo Demo - 3D Poisson’s equation or the demo programs and

Coupled problems

With Shenfun it is possible to solve equations coupled and implicit using the MixedTensorProductSpace class for multidimensional problems and MixedBasis for one-dimensional problems. As an example, lets consider a mixed formulation of the Poisson equation. The Poisson equation is given as always as

(5)\[\nabla^2 u(\boldsymbol{x}) = f(\boldsymbol{x}), \quad \text{for} \quad \boldsymbol{x} \in \Omega,\]

but now we recast the problem into a mixed formulation

\[\begin{split}\sigma(\boldsymbol{x})- \nabla u (\boldsymbol{x})&= 0, \quad \text{for} \quad \boldsymbol{x} \in \Omega, \\ \nabla \cdot \sigma (\boldsymbol{x})&= f(\boldsymbol{x}), \quad \text{for} \quad \boldsymbol{x} \in \Omega.\end{split}\]

where we solve for the vector \(\sigma\) and scalar \(u\) simultaneously. The domain \(\Omega\) is taken as a multidimensional Cartesian product \(\Omega=[-1, 1] \times [0, 2\pi]\), but the code is more or less identical for a 3D problem. For boundary conditions we use Dirichlet in the \(x\)-direction and periodicity in the \(y\)-direction:

\[\begin{split}u(\pm 1, y) &= 0 \\ u(x, 2\pi) &= u(x, 0)\end{split}\]

Note that there is no boundary condition on \(\sigma\), only on \(u\). For this reason we choose a Dirichlet basis \(SD\) for \(u\) and a regular Legendre or Chebyshev \(ST\) basis for \(\sigma\). With \(K0\) representing the function space in the periodic direction, we get the relevant 2D tensor product spaces as \(TD = SD \otimes K0\) and \(TT = ST \otimes K0\). Since \(\sigma\) is a vector we use a VectorTensorProductSpace \(VT = TT \times TT\) and finally a MixedTensorProductSpace \(Q = VT \times TD\) for the coupled and implicit treatment of \((\sigma, u)\):

N, M = (16, 24)
family = 'Legendre'
SD = Basis(N[0], family, bc=(0, 0))
ST = Basis(N[0], family)
K0 = Basis(N[1], 'Fourier', dtype='d')
TD = TensorProductSpace(comm, (SD, K0), axes=(0, 1))
TT = TensorProductSpace(comm, (ST, K0), axes=(0, 1))
VT = VectorTensorProductSpace(TT)
Q = MixedTensorProductSpace([VT, TD])

In variational form the problem reads: find \((\sigma, u) \in Q\) such that

(6)\[\begin{split}(\sigma, \tau)_w - (\nabla u, \tau)_w &= 0, \quad \forall \tau \in VT, \\ (\nabla \cdot \sigma, v)_w &= (f, v)_w \quad \forall v \in TD\end{split}\]

To implement this we use code that is very similar to regular, uncoupled problems. We create test and trialfunction:

gu = TrialFunction(Q)
tv = TestFunction(Q)
sigma, u = gu
tau, v = tv

and use these to assemble all blocks of the variational form (6):

# Assemble equations
A00 = inner(sigma, tau)
if family.lower() == 'legendre':
    A01 = inner(u, div(tau))
    A01 = inner(-grad(u), tau)
A10 = inner(div(sigma), v)

Note that we here can use integration by parts for Legendre, since the weight function is a constant, and as such get the term \((-\nabla u, \tau)_w = (u, \nabla \cdot \tau)_w\) (boundary term is zero due to homogeneous Dirichlet boundary conditions).

We collect all assembled terms in a BlockMatrix:

H = BlockMatrix(A00+A01+A10)

This block matrix H is then simply (for Legendre)

(7)\[\begin{split}\begin{bmatrix} (\sigma, \tau)_w & (u, \nabla \cdot \tau)_w \\ (\nabla \cdot \sigma, v)_w & 0 \end{bmatrix}\end{split}\]

Note that each item in (7) is a collection of instances of the TPMatrix class, and for similar reasons as given around (4), we get also here one regular block matrix for each Fourier wavenumber. The sparsity pattern is the same for all matrices except for wavenumber 0. The (highly sparse) sparsity pattern for block matrix \(H\) with wavenumber \(\ne 0\) is shown in the image below


A complete demo for the coupled problem discussed here can be found in and a 3D version is in


The integrators module contains some interator classes that can be used to integrate a solution forward in time. However, for now these integrators are only implemented for purely Fourier tensor product spaces. There are currently 3 different integrator classes

  • RK4: Runge-Kutta fourth order

  • ETD: Exponential time differencing Euler method

  • ETDRK4: Exponential time differencing Runge-Kutta fourth order

See, e.g., H. Montanelli and N. Bootland “Solving periodic semilinear PDEs in 1D, 2D and 3D with exponential integrators”,

Integrators are set up to solve equations like

(8)\[ \frac{\partial u}{\partial t} = L u + N(u)\]

where \(u\) is the solution, \(L\) is a linear operator and \(N(u)\) is the nonlinear part of the right hand side.

To illustrate, we consider the time-dependent 1-dimensional Kortveeg-de Vries equation

\[\frac{\partial u}{\partial t} + \frac{\partial ^3 u}{\partial x^3} + u \frac{\partial u}{\partial x} = 0\]

which can also be written as

\[\frac{\partial u}{\partial t} + \frac{\partial ^3 u}{\partial x^3} + \frac{1}{2}\frac{\partial u^2}{\partial x} = 0\]

We neglect boundary issues and choose a periodic domain \([0, 2\pi]\) with Fourier exponentials as test functions. The initial condition is chosen as

(9)\[ u(x, t=0) = 3 A^2/\cosh(0.5 A (x-\pi+2))^2 + 3B^2/\cosh(0.5B(x-\pi+1))^2\]

where \(A\) and \(B\) are constants. For discretization in space we use the basis \(V_N = span\{exp(\imath k x)\}_{k=0}^N\) and formulate the variational problem: find \(u \in V_N\) such that

\[\frac{\partial }{\partial t} \Big(u, v \Big) = -\Big(\frac{\partial^3 u }{\partial x^3}, v \Big) - \Big(\frac{1}{2}\frac{\partial u^2}{\partial x}, v\Big), \quad \forall v \in V_N\]

We see that the first term on the right hand side is linear in \(u\), whereas the second term is nonlinear. To implement this problem in shenfun we start by creating the necessary basis and test and trial functions

import numpy as np
from shenfun import *

N = 256
T = Basis(N, 'F', dtype='d')
u = TrialFunction(T)
v = TestFunction(T)
u_ = Array(T)
u_hat = Function(T)

We then create two functions representing the linear and nonlinear part of (8):

def LinearRHS(**params):
    return -inner(Dx(u, 0, 3), v)

k = T.wavenumbers(scaled=True, eliminate_highest_freq=True)
def NonlinearRHS(u, u_hat, rhs, **params):
    u_[:] = T.backward(u_hat, u_)
    rhs = T.forward(-0.5*u_**2, rhs)
    return rhs*1j*k   # return inner(grad(-0.5*Up**2), v)

Note that we differentiate in NonlinearRHS by using the wavenumbers k directly. Alternative notation, that is given in commented out text, is slightly slower, but the results are the same.

The solution vector u_ needs also to be initialized according to (9)

A = 25.
B = 16.
x = T.points_and_weights()[0]
u_[:] = 3*A**2/np.cosh(0.5*A*(x-np.pi+2))**2 + 3*B**2/np.cosh(0.5*B*(x-np.pi+1))**2
u_hat = T.forward(u_, u_hat)

Finally we create an instance of the ETDRK4 solver, and integrate forward with a given timestep

dt = 0.01/N**2
end_time = 0.006
integrator = ETDRK4(T, L=LinearRHS, N=NonlinearRHS)
u_hat = integrator.solve(u_, u_hat, dt, (0, end_time))

The solution is two waves travelling through eachother, seemingly undisturbed.



Shenfun makes use of the Message Passing Interface (MPI) to solve problems on distributed memory architectures. OpenMP is also possible to enable for FFTs.

Dataarrays in Shenfun are distributed using a new and completely generic method, that allows for any index of a multidimensional array to be distributed. To illustrate, lets consider a TensorProductSpace of three dimensions, such that the arrays living in this space will be 3-dimensional. We create two spaces that are identical, except from the MPI decomposition, and we use 4 CPUs (mpirun -np 4 python, if we store the code in this section as

from shenfun import *
from mpi4py import MPI
from mpi4py_fft import generate_xdmf
N = (20, 40, 60)
K0 = Basis(N[0], 'F', dtype='D', domain=(0, 1))
K1 = Basis(N[1], 'F', dtype='D', domain=(0, 2))
K2 = Basis(N[2], 'F', dtype='d', domain=(0, 3))
T0 = TensorProductSpace(comm, (K0, K1, K2), axes=(0, 1, 2), slab=True)
T1 = TensorProductSpace(comm, (K0, K1, K2), axes=(1, 0, 2), slab=True)

Here the keyword slab determines that only one index set of the 3-dimensional arrays living in T0 or T1 should be distributed. The defaul is to use two, which corresponds to a so-called pencil decomposition. The axes-keyword determines the order of which transforms are conducted, starting from last to first in the given tuple. Note that T0 now will give arrays in real physical space that are distributed in the first index, whereas T1 will give arrays that are distributed in the second. This is because 0 and 1 are the first items in the tuples given to axes.

We can now create some Arrays on these spaces:

u0 = Array(T0, val=comm.Get_rank())
u1 = Array(T1, val=comm.Get_rank())

such that u0 and u1 have values corresponding to their communicating processors rank in the COMM_WORLD group (the group of all CPUs).

Note that both the TensorProductSpaces have functions with expansion

(10)\[ u(x, y, z) = \sum_{n=-N/2}^{N/2-1}\sum_{m=-N/2}^{N/2-1}\sum_{l=-N/2}^{N/2-1} \hat{u}_{l,m,n} e^{\imath (lx + my + nz)}.\]

where \(u(x, y, z)\) is the continuous solution in real physical space, and \(\hat{u}\) are the spectral expansion coefficients. If we evaluate expansion (10) on the real physical mesh, then we get

(11)\[ u(x_i, y_j, z_k) = \sum_{n=-N/2}^{N/2-1}\sum_{m=-N/2}^{N/2-1}\sum_{l=-N/2}^{N/2-1} \hat{u}_{l,m,n} e^{\imath (lx_i + my_j + nz_k)}.\]

The function \(u(x_i, y_j, z_k)\) corresponds to the arrays u0, u1, whereas we have not yet computed the array \(\hat{u}\). We could get \(\hat{u}\) as:

u0_hat = Function(T0)
u0_hat = T0.forward(u0, u0_hat)

Now, u0 and u1 have been created on the same mesh, which is a structured mesh of shape \((20, 40, 60)\). However, since they have different MPI decomposition, the values used to fill them on creation will differ. We can visualize the arrays in Paraview using some postprocessing tools, to be further described in Sec Post processing:

u0.write('myfile.h5', 'u0', 0, domain=T0.mesh())
u1.write('myfile.h5', 'u1', 0, domain=T1.mesh())
if comm.Get_rank() == 0:

And when the generated myfile.xdmf is opened in Paraview, we can see the different distributions. The function u0 is shown first, and we see that it has different values along the short first dimension. The second figure is evidently distributed along the second dimension. Both arrays are non-distributed in the third and final dimension, which is fortunate, because this axis will be the first to be transformed in, e.g., u0_hat = T0.forward(u0, u0_hat).

_images/datastructures0.png _images/datastructures1.png

We can now decide to distribute not just one, but the first two axes using a pencil decomposition instead. This is achieved simply by dropping the slab keyword:

T2 = TensorProductSpace(comm, (K0, K1, K2), axes=(0, 1, 2))
u2 = Array(T2, val=comm.Get_rank())
u2.write('pencilfile.h5', 'u2', 0)
if comm.Get_rank() == 0:

Running again with 4 CPUs the array u2 will look like:


The local slices into the global array may be obtained through:

>>> print(comm.Get_rank(), T2.local_slice(False))
0 [slice(0, 10, None), slice(0, 20, None), slice(0, 60, None)]
1 [slice(0, 10, None), slice(20, 40, None), slice(0, 60, None)]
2 [slice(10, 20, None), slice(0, 20, None), slice(0, 60, None)]
3 [slice(10, 20, None), slice(20, 40, None), slice(0, 60, None)]

In spectral space the distribution will be different. This is because the discrete Fourier transforms are performed one axis at the time, and for this to happen the dataarrays need to be realigned to get entire axis available for each processor. Naturally, for the array in the pencil example (see image), we can only perform an FFT over the third and longest axis, because only this axis is locally available to all processors. To do the other directions, the dataarray must be realigned and this is done internally by the TensorProductSpace class. The shape of the datastructure in spectral space, that is the shape of \(\hat{u}\), can be obtained as:

>>> print(comm.Get_rank(), T2.local_slice(True))
0 [slice(0, 20, None), slice(0, 20, None), slice(0, 16, None)]
1 [slice(0, 20, None), slice(0, 20, None), slice(16, 31, None)]
2 [slice(0, 20, None), slice(20, 40, None), slice(0, 16, None)]
3 [slice(0, 20, None), slice(20, 40, None), slice(16, 31, None)]

Evidently, the spectral space is distributed in the last two axes, whereas the first axis is locally avalable to all processors. Tha dataarray is said to be aligned in the first dimension.

Post processing

MPI is great because it means that you can run Shenfun on pretty much as many CPUs as you can get your hands on. However, MPI makes it more challenging to do visualization, in particular with Python and Matplotlib. For this reason there is a utilities module with helper classes for dumping dataarrays to HDF5 or NetCDF

Most of the IO has already been implemented in mpi4py-fft. The classes HDF5File and NCFile are used exactly as they are implemented in mpi4py-fft. As a common interface we provide

where ShenfunFile() returns an instance of either HDF5File or NCFile, depending on choice of backend.

For example, to create an HDF5 writer for a 3D TensorProductSpace with Fourier bases in all directions:

from shenfun import *
from mpi4py import MPI
N = (24, 25, 26)
K0 = Basis(N[0], 'F', dtype='D')
K1 = Basis(N[1], 'F', dtype='D')
K2 = Basis(N[2], 'F', dtype='d')
T = TensorProductSpace(MPI.COMM_WORLD, (K0, K1, K2))
fl = ShenfunFile('myh5file', T, backend='hdf5', mode='w')

The file instance fl will now have two method that can be used to either write dataarrays to file, or read them back again.

  • fl.write


With the HDF5 backend we can write both arrays from physical space (Array), as well as spectral space (Function). However, the NetCDF4 backend cannot handle complex dataarrays, and as such it can only be used for real physical dataarrays.

In addition to storing complete dataarrays, we can also store any slices of the arrays. To illustrate, this is how to store three snapshots of the u array, along with some global 2D and 1D slices:

u = Array(T)
u[:] = np.random.random(u.shape)
d = {'u': [u, (u, np.s_[4, :, :]), (u, np.s_[4, 4, :])]}
fl.write(0, d)
u[:] = 2
fl.write(1, d)

The ShenfunFile may also be used for the MixedTensorProductSpace, or VectorTensorProductSpace, that are collections of the scalar TensorProductSpace. We can create a MixedTensorProductSpace consisting of two TensorProductSpaces, and an accompanying writer class as:

TT = MixedTensorProductSpace([T, T])
fl_m = ShenfunFile('mixed', TT, backend='hdf5', mode='w')

Let’s now consider a transient problem where we step a solution forward in time. We create a solution array from the Array class, and update the array inside a while loop:

TT = VectorTensorProductSpace(T)
fl_m = ShenfunFile('mixed', TT, backend='hdf5', mode='w')
u = Array(TT)
tstep = 0
du = {'uv': (u,
            (u, [4, slice(None), slice(None)]),
            (u, [slice(None), 10, 10]))}
while tstep < 3:
    fl_m.write(tstep, du, forward_output=False)
    tstep += 1

Note that on each time step the arrays u, (u, [4, slice(None), slice(None)]) and (u, [slice(None), 10, 10]) are vectors, and as such of global shape (3, 24, 25, 26), (3, 25, 26) and (3, 25), respectively. However, they are stored in the hdf5 file under their spatial dimensions 1D, 2D and 3D, respectively.

Note that the slices in the above dictionaries are global views of the global arrays, that may or may not be distributed over any number of processors. Also note that these routines work with any number of CPUs, and the number of CPUs does not need to be the same when storing or retrieving the data.

After running the above, the different arrays will be found in groups stored in myyfile.h5 with directory tree structure as:

└─ u/
   ├─ 1D/
   |  └─ 4_4_slice/
   |     ├─ 0
   |     └─ 1
   ├─ 2D/
   |  └─ 4_slice_slice/
   |     ├─ 0
   |     └─ 1
   ├─ 3D/
   |  ├─ 0
   |  └─ 1
   └─ mesh/
      ├─ x0
      ├─ x1
      └─ x2

Likewise, the mixed.h5 file will at the end of the loop look like:

└─ uv/
   ├─ 1D/
   |  └─ slice_10_10/
   |     ├─ 0
   |     ├─ 1
   |     └─ 3
   ├─ 2D/
   |  └─ 4_slice_slice/
   |     ├─ 0
   |     ├─ 1
   |     └─ 3
   ├─ 3D/
   |  ├─ 0
   |  ├─ 1
   |  └─ 3
   └─ mesh/
      ├─ x0
      ├─ x1
      └─ x2

Note that the mesh is stored as well as the results. The three mesh arrays are all 1D arrays, representing the domain for each basis in the TensorProductSpace.

With NetCDF4 the layout is somewhat different. For mixed above, if we were using backend netcdf instead of hdf5, we would get a datafile where ncdump -h would result in:

netcdf mixed {
        time = UNLIMITED ; // (3 currently)
        i = 3 ;
        x = 24 ;
        y = 25 ;
        z = 26 ;
        double time(time) ;
        double i(i) ;
        double x(x) ;
        double y(y) ;
        double z(z) ;
        double uv(time, i, x, y, z) ;
        double uv_4_slice_slice(time, i, y, z) ;
        double uv_slice_10_10(time, i, x) ;

Note that it is also possible to store vector arrays as scalars. For NetCDF4 this is necessary for direct visualization using Visit. To store vectors as scalars, simply use:

fl_m.write(tstep, du, forward_output=False, as_scalar=True)


The stored datafiles can be visualized in ParaView. However, ParaView cannot understand the content of these HDF5-files without a little bit of help. We have to explain that these data-files contain structured arrays of such and such shape. The way to do this is through the simple XML descriptor XDMF. To this end there is a function imported from mpi4py-fft called generate_xdmf that can be called with any one of the generated hdf5-files:


This results in some light xdmf-files being generated for the 2D and 3D arrays in the hdf5-file:

  • myh5file.xdmf

  • myh5file_4_slice_slice.xdmf

  • mixed.xdmf

  • mixed_4_slice_slice.xdmf

These xdmf-files can be opened and inspected by ParaView. Note that 1D arrays are not wrapped, and neither are 4D.

An annoying feature of Paraview is that it views a three-dimensional array of shape \((N_0, N_1, N_2)\) as transposed compared to shenfun. That is, for Paraview the last axis represents the \(x\)-axis, whereas shenfun (like most others) considers the first axis to be the \(x\)-axis. So when opening a three-dimensional array in Paraview one needs to be aware. Especially when plotting vectors. Assume that we are working with a Navier-Stokes solver and have a three-dimensional VectorTensorProductSpace to represent the fluid velocity:

from mpi4py import MPI
from shenfun import *

N = (32, 64, 128)
V0 = Basis(N[0], 'F', dtype='D')
V1 = Basis(N[1], 'F', dtype='D')
V2 = Basis(N[2], 'F', dtype='d')
T = TensorProductSpace(comm, (V0, V1, V2))
TV = VectorTensorProductSpace(T)
U = Array(TV)
U[0] = 0
U[1] = 1
U[2] = 2

To store the resulting Array U we can create an instance of the HDF5File class, and store using keyword as_scalar=True:

hdf5file = ShenfunFile("NS", TV, backend='hdf5', mode='w')
file.write(0, {'u': [U]}, as_scalar=True)
file.write(1, {'u': [U]}, as_scalar=True)

Alternatively, one may store the arrays directly as:

U.write('U.h5', 'u', 0, domain=T.mesh(), as_scalar=True)
U.write('U.h5', 'u', 1, domain=T.mesh(), as_scalar=True)

Generate an xdmf file through:


and open the generated NS.xdmf file in Paraview. You will then see three scalar arrays u0, u1, u2, each one of shape (32, 64, 128), for the vector component in what Paraview considers the \(z\), \(y\) and \(x\) directions, respectively. Other than the swapped coordinate axes there is no difference. But be careful if creating vectors in Paraview with the Calculator. The vector should be created as: