Nitsche method for weakly enforced Dirichlet boundary conditions

Disclaimer

For the sake of simplicity, the following post focuses on weak Dirichlet boundary conditions applied to the linear elastostatics problem. This document reflects my understanding and interpretation of the method. If you spot a mistake, please let me know !

Context

Numerical models of physical systems are often represented as Boundary Value Problem, which consists of a Partial Differential Equation governing the domain accompanied by boundary conditions, known values on the boundary of the domain. Boundary conditions are effectively constraints: they are values that must be enforced at some coordinates. Boundary conditions applied directly on the unknown field are called essential or Dirichlet boundary conditions. The spatial discretization required by the weak form of the problem is often chosen to match the boundary of the domain of interest. Together with an interpolating polynomial space, this enables to statically condense out known values from the resulting linear system of equations. However, this does not have to be the case, and the discretized space can cover the domain of interest. In such cases, the essential boundary conditions cannot be straightforwardly enforced.

Penalty method

The penalty method is a way to apply constraint to a system relying only on primal unknown fields. It consists in augmenting the potential energy of the system with an additional term. \[ \Pi(\mathbf{u})=\Pi_\text{system}(\mathbf{u})+\Pi_\text{penalty}(\mathbf{u}) \] The penalty term can conveniently be defined as a convex quadratic energy: \[ \Pi_\text{penalty}(\mathbf{u})=\tfrac{1}{2}\alpha\Vert g(\mathbf{u})\Vert^2 \]

where \(g\) is a scalar holonomic constraint and \(\alpha\) the penalty factor. One can recognize the analogy with the potential energy of a spring \(\Pi_\text{spring}=\tfrac{1}{2}k(l-l_0)^2\).

Unfortunately, although appealing for its simplicity, the penalty method is not consistent.

So what is consistency, or for a method to be said consistent ? It means that the method can attain the exact desired solution. The penalty method is consistent only if \(\alpha\to\infty\). However, a high “stiffness” of the constraint results in severe ill-conditioning of the numerical system. For example, let us consider the following simple mechanical system of a mass attached to a spring:

The spring represents the energy from the constraint, while the weight of the particle represents the energy of the system. One understands easily that the spring will never reach its rest length down to an arbitrarily small tolerance (satisfaction of the constraint) unless the ratio mass over constraint stiffness tend to zero, which means either no energy in the system or infinitely high constraint stiffness.

To understand the source of this issue, we need to backtrack the variational formulation of the system.

Strong form of the problem

The equations of elastostatics used in the following are \[ \begin{aligned} \nabla\cdot\boldsymbol\sigma + \mathbf{b} &=0 &&\text{in }\Omega&& \textsf{Balance of linear momentum}\\ \boldsymbol\sigma&=\mathbb{C}:\boldsymbol\varepsilon && && \textsf{Constitutive equation}\\ \boldsymbol\varepsilon &= \tfrac{1}{2}(\nabla\mathbf{u}^T+\nabla\mathbf{u}) && && \textsf{Kinematics compatibility}\\ \boldsymbol\sigma &= \boldsymbol\sigma^T && &&\textsf{Balance of angular momentum}\\ \mathbf{u}&=\mathbf{u}_d && \text{in }\Omega_d && \textsf{Dirichlet BC}\\ {\boldsymbol\sigma\cdot\mathbf{n}}&=\mathbf{t}_n &&\text{in }\Omega_n && \textsf{Neumann BC}\\ \end{aligned} \]

where \(\mathbf{u}\), the displacement, is the principal unknown of the problem.

Weak formulation

The weak form is obtained by multiplying a test function \(\mathbf{v}\) and integrating over the domain. Unlike the strong form which enforces the equation pointwise, the weak form provides a solution in an “average” sense.
\[ \begin{equation} \int_\Omega \mathbf{v}(\nabla\cdot\boldsymbol\sigma)-\int_\Omega\mathbf{v}\mathbf{b} = 0 \label{weak_balance} \end{equation} \]

Using integration by parts on the first term \(\int(ab)'=\int a'b+\int ab'\) yields \[ \int_\Omega\nabla\cdot(\mathbf{v}\boldsymbol\sigma)=\int_\Omega\mathbf{v}(\nabla\cdot\boldsymbol\sigma)+\int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma \] Using the divergence theorem, the left hand side becomes \[ \int_\Omega\nabla\cdot(\mathbf{v}\boldsymbol\sigma) = \int_{\partial\Omega}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) \] Combining both yields \[ \int_\Omega\mathbf{v}(\nabla\cdot\boldsymbol\sigma) = \int_{\partial\Omega}\mathbf{v}(\nabla\boldsymbol\sigma\cdot\mathbf{n}) -\int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma \] Inserting it in \(\eqref{weak_balance}\) leads to \[ \int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma = \int_\Omega\mathbf{v}\mathbf{b} + \int_{\partial\Omega}\mathbf{v}(\nabla\boldsymbol\sigma\cdot\mathbf{n}) \] Considering that the boundary of the domain is perfectly divided into two subsets \(\partial\Omega=\partial\Omega_d\cup\partial\Omega_n\) and \(\partial\Omega_d\cap\partial\Omega_n=\emptyset\) \[ \int_{\partial\Omega}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) = \int_{\partial\Omega_d}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) + \int_{\partial\Omega_n}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) \] The Neumann boundary conditions can be straightforwardly substituted. Incorporating in the former equation yields the weak form of the problem \[ \begin{equation} \int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma = \int_\Omega\mathbf{v}\mathbf{b} + \int_{\partial\Omega_n}\mathbf{v}\mathbf{t}_n + \int_{\partial\Omega_d}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) %+ \int_{\partial\Omega_d}\mathbf{v}(\mathbf{u}_d-\mathbf{u}) \label{weak_general} \quad\text{and}\quad \mathbf{u}=\mathbf{u}_d\;\text{in}\;\partial\Omega_d \end{equation} \] From here, the problem is often further simplified such that \(\mathbf{v}\) vanishes on \(\partial\Omega_d\) yielding \[ \begin{equation} \int_{\partial\Omega_d}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n})=0 \label{missing_term} \end{equation} \] and such that the discretization of \(\mathbf{u}\) fulfills the Kronecker-delta property, which means that the field \(\mathbf{u}\) is interpolated. This assumption implies that the Dirichlet boundary conditions \(\mathbf{u}=\mathbf{u}_d\) can be strongly enforced on the nodes of the field \(\mathbf{u}\) and statically condensed away from the resulting linear system. The reduced and often presented weak form is then \[ \begin{equation} \Pi_\text{system} = -\int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma + \int_\Omega\mathbf{v}\mathbf{b} + \int_{\partial\Omega_n}\mathbf{v}\mathbf{t}_n \quad\text{with}\quad \mathbf{u}=\mathbf{u}_d\;\text{in}\;\partial\Omega_d \label{weak_reduced} \end{equation} \] However, the rise of approximating discretization (B-Splines, RBF, etc) — as opposed to interpolating — and non-conforming discretization (material point method, finite cell method, etc) highlights the need for weakly imposed Dirichlet boundary conditions.

One can now get a better grasp why the usual penalty method lack of consistency, the term \(\int_{\partial\Omega_d}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n})\) is conveniently omitted from the standard formulation, and forgotten when it is needed back.

Nitsche method

The lack of consistency of the penalty method comes from applying it onto the simplified weak form \(\eqref{weak_reduced}\). The Nitsche method, in essence, is relatively simple: applying the penalty method to the original problem \(\eqref{weak_general}\). That means incorporating the overlooked term \(\eqref{missing_term}\). \[ \begin{equation} \int_\Omega\nabla\mathbf{v}:\boldsymbol\sigma = \int_\Omega\mathbf{v}\mathbf{b} + \int_{\partial\Omega_n}\mathbf{v}\mathbf{t}_n + \underbrace{\int_{\partial\Omega_d}\mathbf{v}(\boldsymbol\sigma\cdot\mathbf{n}) + \underbrace{\alpha\int_{\partial\Omega_d}\mathbf{v}(\mathbf{u}_d-\mathbf{u}}_\text{Penalty})}_\text{Nitsche} \label{weak_general_dirichlet} \end{equation} \]

The weak form can be made symmetric by adding the term \(\int_{\partial\Omega_d}\mathbf{u}(\boldsymbol\sigma(\mathbf{v})\cdot\mathbf{n})\)

Physical interpretation

In brief, the penalty term pushes to fulfill the constraint, while the latter term cancels any existing energy in the system that may compete. Free from resistance, the penalty term can therefore lead to the exact fulfillment of the constraint. In a simplified view, reusing the previous example of the mass-spring, this looks like:

Linear system of equations

The system of equations resulting from the linearization yields something like that \[ \left[\mathbf{K}+\mathbf{K}_\text{p}-(\mathbf{K}_\text{n}+\mathbf{K}_\text{n}^T)\right] \Delta\mathbf{q}=\mathbf{f}+\mathbf{f}_\text{p}-\mathbf{f}_\text{n} \] where \(\mathbf{K},\mathbf{K}_\text{p},\mathbf{K}_\text{n}\) denotes the stiffness matrices, where \(\mathbf{f},\mathbf{f}_\text{p},\mathbf{f}_\text{n}\) denotes the force vectors, and \(\mathbf{q}\) is the discretized displacement field. The structure of the equation highlights the interpretation given in the previous section: the stiffness/force on the boundary \((\mathbf{K}_\text{n}/\mathbf{f}_\text{n})\) is substracted from the stiffness/force within the system \((\mathbf{K},\mathbf{f})\) onto which the stiffness/force of the penalty term \((\mathbf{K}_\text{p}/\mathbf{f}_\text{p})\) is added. \[ \begin{aligned} \mathbf{K}_{\text{n},ab} &= \int_{\partial\Omega_d}\frac{\partial\mathbf{u}_i}{\partial\mathbf{q}_a}\frac{\partial\boldsymbol\sigma_{ij}}{\partial\boldsymbol\varepsilon_{kl}}\mathbf{n}_j\frac{\partial\boldsymbol\varepsilon_{kl}}{\partial\mathbf{q}_b}\\ \mathbf{K}_{\text{p},ab} &= \alpha\int_{\partial\Omega_d}\frac{\partial\mathbf{u}_i}{\partial\mathbf{q}_a}\frac{\partial\mathbf{u}_i}{\partial\mathbf{q}_b} \end{aligned} \]

Connection to Lagrange multipliers

Lagrange multipliers are an convenient way to enforce constraints consistently. Unlike the penalty and Nitsche method, it increases the size of the numerical system by adding unknowns. \[ \Pi(\mathbf{u},\boldsymbol\lambda)=\Pi_\text{system}(\mathbf{u})+\Pi_\text{lagrange}(\mathbf{u},\boldsymbol\lambda) \] where the new term, for weakly enforced boundary conditions, can be formulated as \[ \Pi_\text{lagrange}(\mathbf{u},\boldsymbol\lambda) = \int_{\partial\Omega_d}\boldsymbol\lambda(\mathbf{u}-\mathbf{u}_d) \] One can notice the correspondence with \(\eqref{missing_term}\) if \[ \boldsymbol\lambda=\boldsymbol\sigma\mathbf{n} \] Effectively, the Lagrange multiplier is an unknown in the system, and its interpretation here is the normal traction on the boundary.

Conclusion

The Nitsche method is the correct (variationally consistent) version of the penalty method, which does not omit terms from the original weak form of the problem at hand. The Nitsche method requires knowledge of the actual problem to enforce the constraints, while the Lagrange multipliers, by considering new unknowns, allows to apply constraints without knowledge of the underlying problem. Nevertheless, the problem is rarely a black box in practice so that implementation of the Nitsche method should be moderately complex. The similarity to the augmented Lagrangian method was highlighted but not detailed.

References

(2008) A gentle introduction to the Finite Element Method by Francisco–Javier Sayas.

(2008) A formulation for frictionless contact problems using a weak form introduced by Nitsche by Wriggers and Zavarise.

(2017) An overview of recent results on Nitsche’s method for contact problems by Chouly, Fabre, Hild, Mlika, Pousin, and Renard.

(2019) Weak impositions of Dirichlet boundary conditions in solid mechanics: A critique of current approaches and extension to partially prescribed boundaries by Lu, Augarde, Coombs, and Hu.