TR-BDF2
Introduction
TR-BDF2 is a composite time-stepping scheme operating on a single step with two substeps. The first substep is relying on the implicit midpoint method (trapezoidal rule), while the second substep uses the second order backward differentiation formula, hence the name TR-BDF2. It has been first described in [Bank1985], and is sometimes called the Bathe method after [Bathe2005]. The scheme is part of the Diagonally Implicit Runge Kutta (DIRK) family of methods. Under some condition, it can also be part of the singly DIRK subset, which means that the same Newton iteration matrix can be used for both sub-steps.
Notations
Let be \(\mathbf{q}\) be a field function of time \(t\) and \(\mathbf{\dot q}, \mathbf{\ddot q}\) be its first and second derivatives with respect to time, \[ \begin{aligned} \mathbf{q} &= \mathbf{q}(t)\\ \mathbf{\dot q} &= \frac{\mathrm{d} \mathbf{q}}{\mathrm{d} t}\\ \mathbf{\ddot q} &= \frac{\mathrm{d} \mathbf{\dot q}}{\mathrm{d} t}\\ \end{aligned} \] In order to make the description of the time integration algorithms less abstract, one can think of $,, $ as position, velocity and acceleration. Moreover, the balance of momentum equation is considered for a concrete derivation of the scheme. Therefore, \[ \notag \mathbf M \mathbf{\ddot q} +\mathbf{f} = 0 \quad \text{where} \quad \mathbf{f} = \mathbf{f}\left(\mathbf{q}(t),\mathbf{\dot q}(t)\right) \] The mass matrix \(\mathbf{M}\) is assumed constant. In addition, to facilitate the notations, the superscripts involving \(t\) will be used to denote time points such that \(\mathbf{q}^{t+\tau} = \mathbf{q}(t+\tau)\) represents the discrete value of \(\mathbf{q}\) at time \(t+\tau\), while subscript involving \(n\) as in \(\mathbf{q}_n^t\) will represent the value of \(\mathbf{q}^t\) at some iteration \(n\). The symbol \(\tau\) represents the timestep.
Substeps
Implicit midpoint
The implicit midpoint method is summarized below for the sake of completeness. The particularity here is that one can choose where the intermediate discrete state vector is computed. Indeed, one can pick any \(\gamma \in ]0,1[\) and get \(\mathbf{q}^{t+\gamma\tau},\mathbf{\dot q}^{t+\gamma\tau}\), which are then used in the three-point BDF2 method. The residual is formulated as follows
\[ \begin{eqnarray*} \begin{aligned} \mathrm{R}=\begin{bmatrix}\mathrm{R}_q\\ \mathrm{R}_\dot q\end{bmatrix}= \begin{bmatrix} &\mathbf{q}^{t+\gamma\tau}-\mathbf{q}^t - \tfrac{\gamma\tau}{2}\left(\mathbf{\dot q}^{t}+\mathbf{\dot q}^{t+\gamma\tau}\right) \\ &\mathbf{M}\left(\mathbf{\dot q}^{t+\gamma\tau} - \mathbf{\dot q}^t\right) - \tfrac{\gamma\tau}{2}\left(\mathbf{f}^{t}+\mathbf{f}^{t+\gamma\tau}\right) \\ \end{bmatrix} \end{aligned}=0 \end{eqnarray*} \] Which gives the iterative procedure below \[ \begin{equation} \begin{aligned} \left(\mathbf{M}+\tfrac{\gamma\tau}{2}\mathbf{D}_n+\tfrac{(\gamma\tau)^2}{4}\mathbf{K}_n\right)\Delta\mathbf{\dot{q}}&= -\mathrm{R}_{\dot{q},n} +\tfrac{\gamma\tau}{2}\mathbf{K}\mathrm{R}_{q,n} \\ \mathbf{\dot{q}}^{t+\gamma\tau}_{n+1}&=\mathbf{\dot{q}}^{t+\gamma\tau}_n+\Delta\mathbf{\dot{q}} \\ \mathbf{q}^{t+\gamma\tau}_{n+1} &= \mathbf{q}^t+ \tfrac{\gamma\tau}{2}\left(\mathbf{\dot q}^{t}+\mathbf{\dot q}^{t+\gamma\tau}_{n+1}\right) \\ \mathbf{f}^{t+\gamma\tau}_{n+1} &= \mathbf{f}\left(\mathbf{q}^{t+\gamma\tau}_{n+1},\mathbf{\dot q}^{t+\gamma\tau}_{n+1}\right) \end{aligned} \label{eq:iteration_unrolled_midpoint} \end{equation} \]
BDF2
The BDF2 is applied using the two previous known solution at time \(t\) and \(t+\gamma\tau\). The formula is written as \[ \frac{1}{\tau}\left(\frac{2-\gamma}{1-\gamma}\mathbf{q}^{t+\tau} - \frac{1}{\gamma(1-\gamma)}\mathbf{q}^{t+\gamma\tau} + \frac{1-\gamma}{\gamma}\mathbf{q}^{t}\right) = \mathbf{\dot{q}}^{t+\tau} \] Applied to the current scenario, the residual is detailed as \[ \begin{eqnarray*} \begin{aligned} \mathrm{R}=\begin{bmatrix}\mathrm{R}_q\\ \mathrm{R}_\dot q\end{bmatrix}= \begin{bmatrix} &\mathbf{q}^{t+\tau} - \frac{1}{\gamma(2-\gamma)}\mathbf{q}^{t+\gamma\tau}+\frac{(1-\gamma)^2}{\gamma(2-\gamma)}\mathbf{q}^{t} - \tau\frac{1-\gamma}{2-\gamma}\mathbf{\dot q}^{t+\tau} \\ &\mathbf{M}\left(\mathbf{\dot q}^{t+\tau} - \frac{1}{\gamma(2-\gamma)}\mathbf{q}^{t+\gamma\tau}+\frac{(1-\gamma)^2}{\gamma(2-\gamma)}\mathbf{q}^{t}\right) + \tau\frac{1-\gamma}{2-\gamma}\mathbf{f}^{t+\tau} \end{bmatrix} \end{aligned}=0 \end{eqnarray*} \] Which gives the iterative procedure below \[ \begin{equation} \begin{aligned} \left(\mathbf{M}+\tfrac{1-\gamma}{2-\gamma}\tau\mathbf{D}_n+\tfrac{(1-\gamma)^2}{(2-\gamma)^2}\tau^2\mathbf{K}_n\right)\Delta\mathbf{\dot{q}}&= -\mathrm{R}_{\dot{q},n} +\tfrac{1-\gamma}{2-\gamma}\tau\mathbf{K}\mathrm{R}_{q,n} \\ \mathbf{\dot{q}}^{t+\tau}_{n+1}&=\mathbf{\dot{q}}^{t+\tau}_n+\Delta\mathbf{\dot{q}} \\ \mathbf{q}^{t+\tau}_{n+1} &= \frac{1}{\gamma(2-\gamma)}\mathbf{q}^{t+\gamma\tau} - \frac{(1-\gamma)^2}{\gamma(2-\gamma)}\mathbf{q}^{t} + \tau\frac{1-\gamma}{2-\gamma}\mathbf{\dot q}^{t+\tau}_{n+1} \\ \mathbf{f}^{t+\gamma\tau}_{n+1} &= \mathbf{f}\left(\mathbf{q}^{t+\tau}_{n+1},\mathbf{\dot q}^{t+\tau}_{n+1}\right) \end{aligned} \label{eq:iteration_unrolled_bdf2} \end{equation} \]
Special cases
A common and intuitive choice for \(\gamma\) has been \(\frac{1}{2}\). The two intervals of the BDF2 are then balanced and it simplifies the resulting equations as such \[ \begin{equation} \begin{aligned} \left(\mathbf{M}+\tfrac{\tau}{3}\mathbf{D}_n+(\tfrac{\tau}{3})^2\mathbf{K}_n\right)\Delta\mathbf{\dot{q}}&= -\mathrm{R}_{\dot{q},n} +\tfrac{\tau}{3}\mathbf{K}\mathrm{R}_{q,n} \\ \mathbf{\dot{q}}^{t+\tau}_{n+1}&=\mathbf{\dot{q}}^{t+\tau}_n+\Delta\mathbf{\dot{q}} \\ \mathbf{q}^{t+\tau}_{n+1} &= \tfrac{4}{3}\mathbf{q}^{t+\gamma\tau} - \tfrac{1}{3}\mathbf{q}^{t} + \tfrac{\tau}{3}\mathbf{\dot q}^{t+\tau}_{n+1} \\ \mathbf{f}^{t+\gamma\tau}_{n+1} &= \mathbf{f}\left(\mathbf{q}^{t+\tau}_{n+1},\mathbf{\dot q}^{t+\tau}_{n+1}\right) \end{aligned} \end{equation} \] An interesting choice for \(\gamma\) noted by [Bank1985] is \(2-\sqrt{2}\). This value leads to identical iteration matrix for both sub-steps, although only for linear problems, i.e. when the stiffness and damping matrix are constant. This result is obtained by taking the scaling factor in front of the damping (or stiffness) matrix from \(\eqref{eq:iteration_unrolled_midpoint}\) and \(\eqref{eq:iteration_unrolled_bdf2}\) to be equal for both substeps \[ \frac{\gamma}{2} = \frac{1-\gamma}{2-\gamma} \implies\tfrac{1}{2}\gamma^2-2\gamma+1=0 \] For which \(\gamma_1=2-\sqrt{2}\) and \(\gamma_2=\sqrt{2}+2\) are solutions of the equation. The solution \(\gamma_1\) is selected because \(\gamma_1<1\).
Which value of \(\gamma\) to choose between the two ? Well, it depends on the problem.
If the problem is linear, then the choice of \(\gamma=2-\sqrt{2}\) is optimal since it reduces the computational cost of the scheme by avoiding an additional assembly of the linear system. On the other hand, it does not really matter for non-linear problems, as demonstrated in [Bathe2007]. Indeed, an assembly of the system should theoretically be done at each Newton step until convergence. However, numerous tricks have been applied to reduce the amount of system assemblies with the aim of reducing the computational cost related to non-linear problems, where the actual effects of the choice of \(\gamma\) are less clear.
Generalization via Wilson-θ method
A generalization of the scheme can be done by replacing the first substep, the implicit midpoint, with the Wilson-θ method. It provides a new parameter to play with besides \(\gamma\).
\[ \begin{eqnarray*} \begin{aligned} \mathrm{R}=\begin{bmatrix}\mathrm{R}_q\\ \mathrm{R}_\dot q\end{bmatrix}= \begin{bmatrix} &\mathbf{q}^{t+\gamma\tau}-\mathbf{q}^t - \theta\gamma\tau\left(\mathbf{\dot q}^{t}+\mathbf{\dot q}^{t+\gamma\tau}\right) \\ &\mathbf{M}\left(\mathbf{\dot q}^{t+\gamma\tau} - \mathbf{\dot q}^t\right) - \theta\gamma\tau\left(\mathbf{f}^{t}+\mathbf{f}^{t+\gamma\tau}\right) \\ \end{bmatrix} \end{aligned}=0 \end{eqnarray*} \] Which gives the iterative procedure below \[ \begin{equation} \begin{aligned} \left(\mathbf{M}+\theta\gamma\tau\mathbf{D}_n+(\theta\gamma\tau)^2\mathbf{K}_n\right)\Delta\mathbf{\dot{q}}&= -\mathrm{R}_{\dot{q},n} +\theta\gamma\tau\mathbf{K}\mathrm{R}_{q,n} \\ \mathbf{\dot{q}}^{t+\gamma\tau}_{n+1}&=\mathbf{\dot{q}}^{t+\gamma\tau}_n+\Delta\mathbf{\dot{q}} \\ \mathbf{q}^{t+\gamma\tau}_{n+1} &= \mathbf{q}^t+ \theta\gamma\tau\left(\mathbf{\dot q}^{t}+\mathbf{\dot q}^{t+\gamma\tau}_{n+1}\right) \\ \mathbf{f}^{t+\gamma\tau}_{n+1} &= \mathbf{f}\left(\mathbf{q}^{t+\gamma\tau}_{n+1},\mathbf{\dot q}^{t+\gamma\tau}_{n+1}\right) \end{aligned} \label{eq:iteration_unrolled_theta} \end{equation} \]
The introduction of a new free parameter unlocks many possibilities. As a result, the choice of the parameter so that the iteration matrix is the same for both substeps (in the linear case) becomes flexible. Indeed, the equation is now \[ \begin{aligned} \theta\gamma=\frac{1-\gamma}{2-\gamma} \implies &\theta\gamma^2-(2\theta+1+)\gamma+1=0 \end{aligned} \] Which can be solved one way (getting \(\gamma\) from a chosen \(\theta\)) or the other (getting \(\theta\) from a chosen \(\gamma\)) \[ \begin{aligned} &\gamma_1=\frac{1+2\theta-\sqrt{4\theta^2+1}}{2\theta} & \gamma_2=\frac{1+2\theta+\sqrt{4\theta^2+1}}{2\theta} \\ &\text{or} \\ &\theta=\frac{1-\gamma}{\gamma(2-\gamma)} \end{aligned} \] Choosing \(\theta=\tfrac{1}{2}\) leads to the previously reported values of \(\gamma=2-\sqrt{2}\) or \(\gamma=2+\sqrt{2}\). However, choosing first \(\gamma=\tfrac{1}{2}\), which simplifies the scheme coefficients as described above, yields \(\theta=\tfrac{2}{3}\) to be able to reuse the same iteration matrix.
From a pure theoretical point of view, it might be interesting to find the parameters such that \(\theta=\gamma\). The equation to solve to get these is now a cubic polynomial \[ \gamma^3-2\gamma^2-2\gamma+1=0 \] which has three solutions \[ \begin{aligned} \gamma_1=&-1\\ \gamma_2=&\frac{3-\sqrt{5}}{2}\approx0.382\\ \gamma_3=&\frac{3+\sqrt{5}}{2}\approx2.618\\ \end{aligned} \]
Alternatively, the first substep could be replaced with the Newmark-β method, as in [Noh2018] or any other single step scheme.
\(\gamma\) above one and beyond
In [Noh2018], the properties of the scheme are explored for many values of \(\gamma\), but fixed values for the Newmark-β part. The chosen values of the Newmark-β part ends up yielding the implicit midpoint method, in order to have second-order accuracy for any \(\gamma\). The values for \(\gamma\) departs from the limited restricted range of [0,1]. The observations are the following:
- Amplitude decay: no decay at \(\gamma\rightarrow0\), then increases until a local maxima at \(\gamma=2-\sqrt{2}\), then decreases to no decay \(\gamma=1\), and sharply increases after for \(\gamma>1\). The decay overpasses the local maxima somewhere \(1.1<\gamma<1.15\)
- Period elongation: locally maximum elongation at \(\gamma\rightarrow0\), then decreases to a local minimum at \(\gamma=2-\sqrt{2}\), then increases again above. The elongation overpasses the local maximum when \(\gamma=1\).
- Spectral radius: shifts left for \(0<\gamma<2-\sqrt{2}\), then shifts right for \(2-\sqrt{2}<\gamma<1\), and sharply shifts left for \(\gamma>1\). The radius overpasses the spectral radius at \(\gamma=2-\sqrt{2}\) somewhere between \(1.15<\gamma<1.2\)
References
[Bank1985]
“Transient Simulation of Silicon Devices and Circuits” (1985)
[Hosea1996]
“Analysis and implementation of TR-BDF2” (1996)
[Dharmaraja2010]
“Optimal stability for trapezoidal-backward difference split-steps”
(2010)
[Bathe2005]
“On a composite implicit time integration procedure for nonlinear
dynamics” (2005)
[Bathe2007]
“Conserving energy and momentum in nonlinear dynamics: A simple implicit
time integration scheme” (2007)
[Rezaei2015] “Mixed
timestepping schemes for nonsmooth mechanics with high frequency
damping” (2015)
[Schindler2017]
“Consistent Time-Integration Schemes for Flexible Multibody Systems with
Friction and Impacts” (2017)
[Noh2018]
“Further insights into an implicit time integration scheme for
structural dynamics” (2018)
[Brown2018]
“Accurate dissipative forces in optimization integrators” (2018)