A p-multigrid compact gas-kinetic scheme for steady-state acceleration

Xing Ji xjiad@connect.ust.hk Wei Shyy weishyy@ust.hk Kun Xu makxu@ust.hk Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong Department of Mechanical and Aerospace Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong Shenzhen Research Institute, Hong Kong University of Science and Technology, Shenzhen, China

Abstract

In this paper, the high-order compact gas-kinetic scheme (CGKS) on three-dimensional hybrid unstructured mesh is further developed with the p-multigrid technique for steady-state solution acceleration. The p-multigrid strategy is a two-level algorithm. On the high-order level, the high-order CGKS is used to evolve both cell-averaged conservative flow variables and their gradients under high-order compact initial reconstruction at the beginning of next time step. On the low-order level, starting from the high-order level solution the cell-averaged conservative flow variables is evolved by a first-order scheme, where implicit backward Euler smoother is adopted for accelerating the convergence of steady-state solution. The final iterative updating scheme becomes numerically simple and computationally efficient. The effectiveness of the p-multigrid method is validated in both subsonic and supersonic flow simulations in two- and three-dimensional space with hybrid unstructured mesh. One order of magnitude speedup in the convergence rate has be achieved by the approach in comparison with the explicit counterpart.

keywords:

compact gas-kinetic scheme, p-multigrid, Navier-Stokes solution, hybrid mesh

1 Introduction

In the past decades, the development of high-order methods for computational fluid dynamics has made great progress [33]. The representative algorithms include the weighted essentially non-oscillatory (WENO) method [1], the Discontinuous Galerkin (DG) method [10], the reconstructed-DG (rDG) method [20], the boundary variation diminishing (BVD) method [27], and the variational finite volume (VFV) method [32].

In recent years, a class of high-order compact gas-kinetic scheme (CGKS) [23, 15] has been developed from the second-order gas-kinetic scheme (GKS) [34]. The time-dependent interface evolution solution in CGKS is based on an analytical integral solution of the Bhatnagar–Gross–Krook equation [5], which recovers a transition process from the kinetic scale flux vector splitting to the central difference Lax-Wendroff type discretization. The time-accurate gas distribution function can be used to evaluate both flux function across a cell interface and the time accurate flow variables at the cell interface. As a result, besides updating the cell-averaged flow variables as a conventional finite volume scheme, the cell-averaged gradients of flow variables can be updated simultaneously through the Gauss-Green theorem. Based on the flow variables and their gradients, a compact high-order reconstruction scheme can be properly designed. Another distinguishable feature of GKS is the use of the explicit two-stage fourth-order temporal discretization (S2O4) or the multi-stage multi-derivative time marching scheme for the high-order temporal discretization [18, 26]. In comparison with the fourth-order four-stage Runge-Kutta (RK) method, the S2O4 is more efficient with only two stages [14]. The limiting process for the possible discontinuous flux function in time has been studied as well in the CGKS [43]. Due to the physically reliable evolution process and efficient time discretization, the CGKS achieves remarkable success for unsteady compressible flow simulation, such as in computational aeroacoustics [41] and implicit large eddy simulation [14]. The CGKS has been constructed up to the eighth-order of accuracy in space on structured mesh [40], and fourth-order accuracy on triangular mesh with the possible use of a large CFL number, such as $CFL\approx 0.8$ [42, 43]. The computation of hypersonic flow around a space vehicle on 3-D hybrid mesh demonstrates its robustness in flow simulation with complex geometry [13]. However, in large-scale simulation, the mesh size around the boundary and far field can be different in thousands of times. The rate of convergence for the steady state can be extremely slow using the explicit time marching approach. Thus, it becomes necessary to develop the acceleration technique for CGKS.

The implicit temporal discretization and multigrid strategy are two popular and efficient techniques for speeding up convergence. Implicit high-order methods have been developed based on the lower-upper symmetric Gauss-Seidel (LU-SGS) method [1] and Generalized minimal residual (GMRES) method [36]. The implicit methods for GKS and unified GKS (UGKS) have also been constructed [28, 17, 44]. These methods solve a linear system of equations derived from the linearization of a fully implicit scheme in each iteration. However, due to the complicated nonlinear evolution model in updating cell interface values, for the high-order compact GKS it is not trivial to derive the linearized Jacobian matrix for the cell-averaged gradients in the development of implicit CGKS. On the other hand, a series of p-multigrid methods have been developed for compact methods with multiple degrees of freedom in each cell [25, 21, 2, 19]. The discretized system is solved by iterating the approximate solutions on different order accuracy recursively. For instance, to solve a DG scheme with polynomial order of 2 inside each element, a three-level algorithm with polynomial orders of $p$ =2, 1, 0 can be adopted and the solution is iterated on each level accuracy. The implicit and low-storage iterative method can be used on the low-order level, e.g., $p$ =0, to eliminate the low-frequency error efficiently. The explicit time marching method is used on the high-order level, e.g., $p$ =2, and avoids the expensive memory consumption of the Jacobian matrix evaluation for high-order solution [21].

In this paper, an efficient and low-storage p-multigrid method is developed for accelerating the CGKS in steady-state simulation. Following the idea in [21], a two-level V-cycle algorithm is designed. On the high-order level, the original explicit CGKS is adopted and both the cell-averaged conservative flow variables their gradients, i.e., $\overline{\textbf{W}}$ and $\overline{\textbf{W}}_{x_{i}}$ , $i=1,2,3$ , are updated. Then the flow solution and residual is transferred to the low-order solution, i.e., a piece-wise constant approximation inside each control volume. On the low-order level, only the cell-averaged conservative flow variables are smoothed, evolved, and transferred back to the high-order level. The first-order flux function, such as the Riemann solution, is applied in the low-order level and the implicit matrix-free point-relaxation method with local time stepping is used in the time integration [37]. The similarity and differences between the p-multigrid method for the CGKS and the DG scheme [21] are as follows.
i) With $\overline{\textbf{W}}$ and $\overline{\textbf{W}}_{x_{i}}$ , the CGKS has a similar memory cost as the $P1$ -DG case on high-order approximation level.
ii) Since $\overline{\textbf{W}}_{x_{i}}$ takes no effect on the transformation of $\overline{\textbf{W}}$ between the two levels, the restriction and prolongation operator for CGKS are very simple, i.e., directly copy the values of $\overline{\textbf{W}}$ and the corresponding residuals.
iii) The explicit multi-stage RK method is adopted for the DG scheme while the single-step second-order time-marching method can be used for the CGKS.
The computational cost between the explicit and p-multigrid iteration is compared through the computations for nearly incompressible and supersonic flows. Moreover, the influence of different fluxes on the low-order level is investigated. The kinetic vector flux splitting solver and classical approximate Riemann solver are compared. Due to the implicit method on the low-order level, the parallel performance for the scheme will be evaluated.

This paper is organized as follows. The basic ingredients of the high-order CGKS on mixed-element mesh are introduced in Section 2. In Section 3, a p-multigrid approach for CGKS is presented. Numerical examples including both inviscid and viscous flow computations are given in Section 4. The last section is the conclusion.

2 Compact gas-kinetic scheme framework

The 3-D gas-kinetic BGK equation [5] is

f_{t}+\textbf{u}\cdot\nabla f=\frac{g-f}{\tau},

(1)

where $f=f(\textbf{x},t,\textbf{u},\xi)$ is the gas distribution function, which is a function of space x, time $t$ , particle velocity u, and internal variable $\xi$ . $g$ is the equilibrium state approached by $f$ and $\tau$ is the collision time.

The collision term satisfies the compatibility condition

\int\frac{g-f}{\tau}\boldsymbol{\psi}\text{d}\Xi=0,

where $\boldsymbol{\psi}=(1,\textbf{u},\displaystyle\frac{1}{2}(\textbf{u}^{2}+\xi^{2}))^{T}$ , $\text{d}\Xi=\text{d}u_{1}\text{d}u_{2}\text{d}u_{3}\text{d}\xi_{1}...\text{d}\xi_{K}$ , $K$ is the number of internal degrees of freedom, i.e. $K=(5-3\gamma)/(\gamma-1)$ in 3-D case, and $\gamma$ is the specific heat ratio.

In the continuum flow regime with the smoothness assumption, based on the Chapman-Enskog expansion the gas distribution function can be expressed as [35] ,

\displaystyle f=g-\tau D_{\textbf{u}}g+\tau D_{\textbf{u}}(\tau D_{\textbf{u}})g-\tau D_{\textbf{u}}[\tau D_{\textbf{u}}(\tau D_{\textbf{u}})g]+...,

where $D_{\textbf{u}}={\partial}/{\partial t}+\textbf{u}\cdot\nabla$ . Different hydrodynamic equations can be derived by truncating on different orders of $\tau$ . With the first-order truncation, i.e.,

\displaystyle f=g-\tau(\textbf{u}\cdot\nabla g+g_{t}),

the N-S equations can be obtained,

\begin{split}\textbf{W}_{t}+\nabla\cdot\textbf{F}(\textbf{W},\nabla\textbf{W})=0,\end{split}

with $\tau=\mu/p$ and $Pr=1$ .

The conservative flow variables and their fluxes are the moments of the gas distribution function

\displaystyle\textbf{W}(\textbf{x},t)=\int\boldsymbol{\psi}f(\textbf{x},t,\textbf{u},\xi)\text{d}\Xi,

(2)

and

\textbf{F}(\textbf{x},t)=\int\textbf{u}\boldsymbol{\psi}f(\textbf{x},t,\textbf{u},\xi)\text{d}\Xi.

(3)

2.1 Finite volume discretization on hybrid grids

For a 3-D polyhedral cell $\Omega_{i}$ , the boundary can be expressed as

\partial\Omega_{i}=\bigcup_{p=1}^{N_{f}}\Gamma_{ip},

where $N_{f}$ is the number of cell interfaces for cell $\Omega_{i}$ . $N_{f}=4$ for tetrahedron, $N_{f}=5$ for prism and pyramid, $N_{f}=6$ for hexahedron. The semi-discretized form of finite volume method for conservation laws can be written as

\frac{\text{d}\textbf{W}_{i}}{\text{d}t}=\mathcal{L}(\textbf{W}_{i})=-\frac{1}{\left|\Omega_{i}\right|}\sum_{p=1}^{N_{f}}\int_{\Gamma_{ip}}\textbf{F}(\textbf{W}(\textbf{x},t))\cdot\textbf{n}_{p}\text{d}s,

(4)

with

\textbf{F}(\textbf{W}(\textbf{x},t))\cdot\textbf{n}_{p}=\int\boldsymbol{\psi}f(\textbf{x},t,\textbf{u},\xi)\textbf{u}\cdot\textbf{n}_{p}\text{d}\Xi,

where $\textbf{W}_{i}$ is the cell averaged values over cell $\Omega_{i}$ , $\left|\Omega_{i}\right|$ is the volume of $\Omega_{i}$ , F is the interface fluxes, and $\textbf{n}_{p}=(n_{1},n_{2},n_{3})^{T}$ is the unit vector representing the outer normal direction of $\Gamma_{ip}$ . Through the iso-parametric transformation, the Gaussian quadrature points can be determined and $\textbf{F}_{ip}(t)$ can be approximated by the numerical quadrature

\sum_{p=1}^{N_{f}}\int_{\Gamma_{ip}}\textbf{F}(\textbf{W}(\textbf{x},t))\cdot\textbf{n}_{p}\text{d}s=\left|\Gamma_{ip}\right|\sum_{k=1}^{M}\omega_{k}\textbf{F}(\textbf{x}_{p,k},t)\cdot\textbf{n}_{p}.

In this work, the linear element is considered. To meet the requirement of a third-order spatial accuracy, three Gaussian points are used for a triangular face and four Gaussian points are used for a quadrilateral face. In the computation, the fluxes are obtained under the local coordinates. The details can be found in [24, 16].

2.2 Gas-kinetic solver

Based on the integral solution of BGK equation, a second-order time accurate gas distribution function at a local Gaussian point $\textbf{x}=(0,0,0)$ is constructed as

$\displaystyle f(\textbf{0},t,\textbf{u},\xi)=$	$\displaystyle(1-e^{-t/\tau_{n}})g^{c}+[(t+\tau)e^{-t/\tau_{n}}-\tau]a_{x_{i}}^{c}u_{i}g^{c}+(t-\tau+\tau e^{-t/\tau_{n}})A^{c}g^{c}$
$\displaystyle+$	$\displaystyle e^{-t/\tau_{n}}g^{l}[1-(\tau+t)a_{x_{i}}^{l}u_{i}-\tau A^{l}]H(u_{1})$
$\displaystyle+$	$\displaystyle e^{-t/\tau_{n}}g^{r}[1-(\tau+t)a_{x_{i}}^{r}u_{i}-\tau A^{r}](1-H(u_{1})).$	(5)

The superscripts $l,r$ represent the initial gas distribution function $f_{0}$ at the left and right sides of a cell interface. The superscript $c$ is the corresponding equilibrium state $g$ in space and time. The integral solution basically states a physical process from the particle free transport in $f_{0}$ in the kinetic scale to the hydrodynamic flow evolution in the integral of $g$ term. The flow evolution at the cell interface depends on the ratio of time step to the local particle collision time $\Delta t/\tau$ .

The $g^{k},~{}k=l,r$ has a form of a Maxwell distribution

\displaystyle g^{k}=\rho^{k}(\frac{\lambda^{k}}{\pi})e^{-\lambda^{k}((u_{i}-U_{i})^{2}+\xi^{2})},

which can be fully determined from the macroscopic variables $\textbf{W}^{l},\textbf{W}^{r}$ through spatial reconstruction

\displaystyle\int\boldsymbol{\psi}g^{l}\text{d}\Xi=\textbf{W}^{l},\int\boldsymbol{\psi}g^{r}\text{d}\Xi=\textbf{W}^{r}.

The spatial and temporal microscopic derivatives are denoted as

\displaystyle a_{x_{i}}\equiv(\partial g/\partial x_{i})/g=g_{x_{i}}/g,A\equiv(\partial g/\partial t)/g=g_{t}/g,

which is determined by the spatial derivatives of macroscopic flow variables and the compatibility condition as follows

	$\displaystyle\langle a_{x_{1}}\rangle=\frac{\partial\textbf{W}}{\partial x_{1}}=\textbf{W}_{x_{1}},\langle a_{x_{2}}\rangle=\frac{\partial\textbf{W}}{\partial x_{2}}=\textbf{W}_{x_{2}},\langle a_{x_{3}}\rangle=\frac{\partial\textbf{W}}{\partial x_{3}}=\textbf{W}_{x_{3}},$
	$\displaystyle\langle A+a_{x_{1}}u_{1}+a_{x_{2}}u_{2}+a_{x_{3}}u_{3}\rangle=0,$

where $\left\langle...\right\rangle$ are the moments of a gas distribution function defined by

\displaystyle\langle(...)\rangle=\int\boldsymbol{\psi}(...)g\text{d}\Xi.

Similarly, the equilibrium state $g^{c}$ and its derivatives $a_{x_{i}}^{c},A_{x_{i}}^{c}$ are determined by corresponding $\textbf{W}^{c},\textbf{W}^{c}_{x_{i}}$ . The details for the compact reconstruction of $\textbf{W}^{l,r,c},\textbf{W}^{l,r,c}_{x_{i}}$ can be found in [13]. The details for calculating each term in the distribution from the corresponding macroscopic flow variable can refer to [35].

For smooth flow, the time dependent solution in Eq. (2.2) can be simplified as [34],

\displaystyle f(\textbf{0},t,\textbf{u},\xi)=g^{c}-\tau(a^{c}_{x_{i}}u_{i}+A^{c})g^{c}+A^{c}g^{c}t,

(6)

under the assumptions of $g^{l,r}=g^{c}$ , $a^{l,r}_{x_{i}}=a^{c}_{x_{i}}$ . The above gas-kinetic solver for smooth flow has less numerical dissipations than the full GKS solver in Eq. (2.2). In smooth flow region, the collision time is determined by

\displaystyle\tau=\mu/p,

where $\mu$ is the dynamic viscosity coefficient and $p$ is the pressure at the cell interface. In order to properly capture the un-resolved discontinuities, additional numerical dissipation is needed. The physical collision time $\tau$ in the exponential function part can be replaced by a numerical collision time $\tau_{n}$ . The same $\tau_{n}$ as that in [14] is adopted in this work.

2.3 Direct evolution of the cell averaged first-order spatial derivatives

As shown in Eq. (2.2), a time evolution solution at a cell interface is provided by the gas-kinetic solver, which is distinguished from the Riemann solvers with a constant solution. Recall Eq.(2), the conservative variables at the Gaussian point $\textbf{x}_{p,k}$ can be updated through the moments $\boldsymbol{\psi}$ of the gas distribution function,

\displaystyle\textbf{W}_{p,k}(t^{n+1})=\int\boldsymbol{\psi}f^{n}(\textbf{x}_{p,k},t^{n+1},\textbf{u},\xi)\text{d}\Xi,~{}k=1,...,M.

Then, the cell-averaged first-order derivatives within each element at $t^{n+1}$ can be evaluated based on the divergence theorem,

\displaystyle\nabla\overline{W}^{n+1}\left|\Omega\right|

\displaystyle=\int_{\Omega}\nabla\overline{W}(t^{n+1})\text{d}V=\int_{\partial\Omega}\overline{W}(t^{n+1})\textbf{n}\text{d}S=\sum_{p=1}^{N_{f}}\sum_{k=1}^{M_{p}}\omega_{p,k}W^{n+1}_{p,k}\textbf{n}_{p,k}\Delta S_{p},

where $\textbf{n}_{p,k}=((n_{1})_{p,k},(n_{2})_{p,k},(n_{3})_{p,k})$ is the outer unit normal direction at each Gaussian point $\textbf{x}_{p,k}$ .

2.4 Explicit one-step temporal discretization

The one-step second-order (S1O2) temporal discretization is adopted here for the steady solution. Following the definition of Eq.(4), a second-order time-accurate solution for cell-averaged conservative flow variables $\textbf{W}_{i}$ are updated by

\displaystyle\textbf{W}_{i}^{n+1}=\textbf{W}_{i}^{n}+\Delta t\mathcal{L}(\textbf{W}_{i}^{n})+\frac{1}{2}\Delta t^{2}\frac{\partial}{\partial t}\mathcal{L}(\textbf{W}_{i}^{n}),

where $\mathcal{L}(\textbf{W}_{i}^{n})$ and $\frac{\partial}{\partial t}\mathcal{L}(\textbf{W}_{i}^{n})$ are given by

	$\displaystyle\mathcal{L}(\textbf{W}_{i}^{n})$	$\displaystyle=-\frac{1}{\left\|\Omega_{i}\right\|}\sum_{p=1}^{N_{f}}\sum_{k=1}^{M}\omega_{p,k}\textbf{F}(\textbf{x}_{p,k},t_{n})\cdot\textbf{n}_{p,k},$
	$\displaystyle\frac{\partial}{\partial t}\mathcal{L}(\textbf{W}_{i}^{n})$	$\displaystyle=-\frac{1}{\left\|\Omega_{i}\right\|}\sum_{p=1}^{N_{f}}\sum_{k=1}^{M}\omega_{p,k}\partial_{t}\textbf{F}(\textbf{x}_{p,k},t_{n})\cdot\textbf{n}_{p,k}.$

The time dependent gas distribution function at a cell interface is updated in a similar way,

\begin{split}f^{n+1}=f^{n}+\Delta tf_{t}^{n}.\end{split}

The $f^{n+1}$ are fully determined by Eq. (2.2) or Eq.(6) and the macroscopic flow variables and their fluxes at the cell interface can be obtained simultaneously by Eq. (2) and Eq. (3). The details can be found in [15].

3 P-multigrid for compact gas-kinetic scheme

A two-level V-cycle p-multigrid algorithm is used to accelerate CGKS for steady-state solutions. On the first level, i.e., the high-order level, the explicit CGKS is applied. On the second level, i.e., the low-order level, the first-order finite volume scheme is used to eliminate the long-wavelength errors efficiently. Specifically, the algorithm adopted in this work is summarized as follows:

1.

Perform a time-step on the high-order level, obtain the solution $\textbf{W}^{n+1}_{p_{1}}$
2.

Transfer the solution $\textbf{W}^{n+1}_{p_{1}}$ and the correspoding residual to the low-order level: piece-wise constant $p_{0}$ . Since the cell averaged $\bar{\textbf{W}}^{n+1}$ is independent from the cell averaged slopes $\bar{\textbf{W}}_{x_{i}}^{n+1},~{}i=1,2,3$ , the solution on the low-order level becomes

$\displaystyle\textbf{W}_{p_{0}}=\bar{\textbf{W}}^{n+1}.$

The residual is the net fluxes for updating $\bar{\textbf{W}}^{n+1}$ ,

$\displaystyle\textbf{R}(\textbf{W}^{n+1}_{p_{1}})=\sum\frac{1}{\Delta t}\int_{t^{n+1}}^{t^{n+1}+\Delta t}\textbf{F}(t)\text{d}t.$

The residual is transferred to the low-order level

$\displaystyle\textbf{R}_{p_{0}}=\textbf{R}(\textbf{W}^{n+1}_{p_{1}}).$
3.

Compute the force terms on the low-order level,

$\displaystyle\textbf{S}_{p_{0}}=\textbf{R}_{p_{0}}-\textbf{R}(\textbf{W}_{p_{0}}),$

where $\textbf{R}(\textbf{W}_{p_{0}})$ is computed from the first-order scheme. Since the gas-kinetic solver is used on the high-order level, it is physically consistent to use a kinetic-type solver on the low-order level. In this paper, the first-order kinetic flux vector splitting (KFVS) scheme is used if no specified. In addition, the results by the approximate Riemann solver derived from the hydrodynamic equations are also presented to investigate the influence of difference fluxes on the low-order level to the convergence performance.
4.

Solve the steady equation

$\displaystyle\textbf{R}(\tilde{\textbf{W}}_{p_{0}})+\textbf{S}_{p_{0}}=0.$ (7)

Interpolate the correction $\textbf{C}_{p_{0}}$ from the low-order level back to the high-order level, and update the solution

	$\displaystyle\textbf{C}_{p_{0}}=\tilde{\textbf{W}}_{p_{0}}-\textbf{W}_{p_{0}},$
	$\displaystyle\tilde{\bar{\textbf{W}}}^{n+1}=\bar{\textbf{W}}^{n+1}+\textbf{C}_{p_{0}}.$

In practice, the explicit iteration for Eq. (7) brings very limited speedup for convergence. Similar conclusions have been drawn in the p-multigrid DG or CPR methods [21, 19]. Instead, the implicit backward Euler method is used as the time discretization, which is introduced as follows.

Discretize the conservation laws with source term by first-order backward Euler in time,

|\Omega_{i}|\frac{\overline{\mathbf{U}_{i}}^{n+1}-\overline{\mathbf{U}_{i}}^{n}}{\Delta t}+\oint_{\partial\Omega}(\mathbf{F}_{n}^{n+1}-\mathbf{F}_{n}^{n})\text{d}s=-\oint_{\partial\Omega}\mathbf{F}_{n}^{n}\text{d}s+|\Omega_{i}|\mathbf{S}_{n}:=\mathbf{R}_{i}^{n}

where $\mathbf{F}_{n}=\mathbf{F}\cdot\textbf{n}$ , n is the outer unit normal direction for each interface, and $\mathbf{F}_{n}^{n}$ is the explicit flux at $t^{n}$ . Denote $\Delta{\mathbf{U}}=\overline{\mathbf{U}}^{n+1}-\overline{\mathbf{U}}^{n}$ , the equation is simplified as

|\Omega_{i}|\frac{\Delta{\mathbf{U}_{i}}}{\Delta t}+\oint_{\partial\Omega}(\mathbf{F}_{n}^{n+1}-\mathbf{F}_{n}^{n})\text{d}s=\mathbf{R}_{i}^{n}

(8)

Linearize the term by first-order Taylor expansion

\mathbf{F}_{n}^{n+1}-\mathbf{F}_{n}^{n}\approx\left(\frac{\partial\mathbf{F}_{n}}{\partial\mathbf{U}}\right)^{n}\left(\mathbf{U}^{n+1}-\mathbf{U}^{n}\right)=\mathbf{A}^{n}\Delta{\mathbf{U}},

where $\mathbf{A}=\left(\frac{\partial\mathbf{F}_{n}}{\partial\mathbf{U}}\right)=\left(\frac{\partial(\mathbf{F}\cdot\textbf{n})}{\partial\mathbf{U}}\right)$ is the Jacobian matrix of the normal flux. The Jacobian matrix of the first-order L-F flux is adopted, which is given by

\begin{split}&(\mathbf{A}\Delta{\mathbf{U}})_{i,p}=\mathbf{A}_{i,p}^{+}\Delta{\mathbf{U}}_{i}+\mathbf{A}_{i,p}^{-}\Delta{\mathbf{U}}_{i,p},\\ &\mathbf{A}^{\pm}=\frac{1}{2}\left(\mathbf{A}\pm\lambda\mathbf{I}\right),\end{split}

where $\Delta{\mathbf{U}}_{i,p}$ is the increment of the conservative variables in the neighboring cell sharing the same interface $\Gamma_{ip}$ . In the current computation, the modified spectral radius $\lambda$ is adopted

\begin{split}\lambda=|U_{n}|+c+2\mu\frac{S}{|\Omega|}.\end{split}

Then we have

\begin{array}[]{l}\oint_{\partial\Omega}\mathbf{A}^{n}\Delta{\mathbf{U}}\textbf{d}s=\sum_{p=1}^{N_{f}}\left(\mathbf{A}_{i}^{+}\Delta{\mathbf{U}}_{i}+\mathbf{A}_{i,p}^{-}\Delta{\mathbf{U}}_{i,p}\right)S_{i,p}.\end{array}

Substitute it into Eq. (8), we have

\begin{array}[]{l}\Delta{\mathbf{U}}_{i}\left[\frac{|\Omega_{i}|}{\Delta t}+\frac{1}{2}\sum_{p=1}^{N_{f}}{(\lambda S)}_{i,p}\right]+\sum_{p=1}^{N_{f}}\mathbf{A}_{i,p}^{-}\Delta{\mathbf{U}}_{i,p}S_{i,p}=\mathbf{R}_{i,j}^{n},\end{array}

(9)

where

{\lambda}_{i+1/2,j}=\text{Max}({\lambda}_{i},{\lambda}_{i,p}).

The delta flux $\mathbf{A}_{i,p}^{-}\Delta{\mathbf{U}}_{i,p}$ is approximated as

\mathbf{A}_{i,p}^{-}\Delta{\mathbf{U}}_{i,p}\approx\Delta(\mathbf{F}_{n})_{i,p}^{-}=\mathbf{F}_{n}^{-}\left(\mathbf{U}^{n}_{i,p}+\Delta{\mathbf{U}}_{i,p}\right)-\mathbf{F}_{n}^{-}\left(\mathbf{U}^{n}_{i,p}\right).

For high Reynolds number flow under highly stretched mesh, the convergence rate can be further accelerated if the local time step for each cell is used. Eq. (9) is rewritten as

\begin{array}[]{l}\Delta{\mathbf{U}}_{i}\left[\frac{|\Omega_{i}|}{\Delta t_{i}}+\frac{1}{2}\sum_{p=1}^{N_{f}}{(\lambda S)}_{i,p}\right]+\sum_{p=1}^{N_{f}}[\mathbf{F}_{n}^{-}\left(\mathbf{U}^{n}_{i,p}+\Delta{\mathbf{U}}_{i,p}\right)-\mathbf{F}_{n}^{-}\left(\mathbf{U}^{n}_{i,p}\right)]=\mathbf{R}_{i,j}^{n},\end{array}

(10)

where $\Delta t_{i}$ is the local time step. To solve the solution matrix formed by Eq. (9), efficient matrix-free iteration methods, such as the LU-SGS method [11] and point relaxation method [37], can be applied.

In this work, the point relaxation method is applied as shown in algorithm 1. Six implicit steps are performed on the low-order level. In each step, the point relaxation method with two sweeps is used. The OPENMP is applied to achieve the thread-level parallelism. For n-thread parallel computation, each sweep for the point-relaxation method is simply divided into n sub-sweeps by the element index. If the series sweeps are for all the elements from 1 to $N_{e}$ , the parallel sweep for the ith thread is simply from $\frac{i-1}{n}N_{e}$ to $\frac{i}{n}N_{e}$ . Since the communications are non-blocking among each thread, the results will be slightly different for each running.

Input: Sweep number

N_{s}

1 Initialization:

\Delta\mathbf{U}_{i}=0,i=0,...,N_{e}

, where

N_{e}

is the total cell number;

2 for j=1,.., $N_{s}$ do

3 for i=1,.., $N_{e}$ do

4 Update

\Delta\mathbf{U}_{i}

by Eq. (10);

6 end for

7 for i= $N_{e}$ ,..,1 do

8 Update

\Delta\mathbf{U}_{i}

by Eq. (10);

10 end for

12 end for

Update

\mathbf{U}_{i}^{n+1}=\mathbf{U}_{i}^{n}+\Delta\mathbf{U}_{i}

Algorithm 1 Point relaxation method

4 Numerical examples

In this section, numerical tests will be presented to validate the proposed scheme. All the simulations are performed on 3-D mesh. For 2-D test cases, two layers of mesh are generated in z-direction. The explicit CFL number for the high-order level is taken as 0.3 $\sim$ 0.5 while the local CFL number for the low-order level is taken as 1000. For subsonic cases in this work, the flow is quite smooth, so the compact reconstruction with only linear weights and the simplified smooth flux in Eq. (6) are adopted for the CGKS. For the transonic and supersonic cases, the reconstruction with non-linear WENO-type weights and the full flux in Eq. (2.2) are used. If no specified, the reconstruction will be performed on the conservative variables, the KFVS solver and OPENMP parallelism with 8 thread will be adopted on the low-order level. The non-dimensional density residual is mainly presented to measure the error level, which is given by

\displaystyle Res=\frac{\sum_{1}^{N_{c}}|\rho^{n+1}_{i}-\rho^{n}_{i}|}{\sum_{1}^{N_{c}}\rho^{n}_{i}},

where $N_{c}$ is the total cell number.

4.1 Subsonic circular cylinder

A subsonic flow around a circular cylinder is simulated. The Mach number is Ma=0.15 and Reynolds number Re=40 are based on the diameter of the cylinder $D=1$ . Two sets of mesh, namely Mesh I and Mesh II are used to evaluate the performance of the p-multigrid CGKS. Mesh I is shown in Fig. 1 with a near wall size $h=1/20$ . A finer mesh, Mesh II is shown in in Fig. 2 with a near wall size $h=1/96$ . A separation bubble is formed behind the cylinder, which is steady and symmetrical. The CPU time history of the density residual is plotted in Fig. 3. The residuals for both explicit and p-multigrid CGKS on Mesh I settle down to machine zero and the speedup for the p-multigrid CGKS is about 4. The residual for p-multigrid CGKS on Mesh II reduces to machine zero smoothly, while the explicit counterpart might need far more steps. In this case, the speedup for the p-multigrid CGKS is about 8 for the same residual $10^{-9}$ . Then, the influence of different flux solvers used on the low-order level is investigated, as shown in Fig. 4. Very close results are obtained for the KFVS solver and the Lax–Friedrichs (L-F) solver under both meshes. It indicates that the usage of either kinetic-type or hydrodynamic-type solver on the low-order level has little effect on the performance of the proposed scheme. The possible explanation is that the first-order scheme on the low-order level is much more dissipative than the CGKS on the high-order level, regardless of the choice of solvers. Thirdly, the performance of the parallel computational efficiency is evaluated. Eight threads are used for the parallel case. It is exciting that the almost identical results are obtained for both series and parallel computation, as shown in Fig. 5, which means no more iterations are required for parallel computation and the simple parallel strategy adopted here is successful. The underlying reason might come to the sufficient number of sub-iterating steps on the low-order level.

To validate the high resolution of the CGKS, the quantitative results under Mesh II including the drag and lift coefficients Cd, Cl, the wake length $L$ , and the separation angle $\theta$ , etc are listed in Tab. 1, which agree well with the experimental and numerical references [30, 8, 38]. Furthermore, the quantities on the cylinder surface are extracted, including the surface pressure coefficient Cp $=\frac{p-p_{\infty}}{\frac{1}{2}\rho_{\infty}U_{\infty}^{2}}$ and the non-dimensional local tangential velocity gradient $\frac{2U_{\infty}}{D}\frac{\partial U_{\tau}}{\partial\eta}$ , as shown in Fig. 6. The Cp from the current CGKS matches nicely with the experimental data [9] and the analytical solution [4]. The tangential velocity gradient obtained by the current scheme is compared with those by the finite difference method [6] and the direct DG method [38].

Refer to caption — Figure 1: Circular cylinder: Re=40. Left: Mesh I with $4725\times 2$ hexahedral cells. Right: Local mesh distribution around cylinder colored by pressure and streamline.

Case Cd Cl Wake Length Vortex Height Vortex Width Separation angle Experiment [30] 1.46 - 1.56 – – – –1 – Experiment [8] – – 2.12 0.297 0.751 53.5^∘ DDG[38] 1.529 – 2.31 – – – current 1.525 3.3e-14 2.22 0.296 0.714 53.3^∘

Table 1: Comparison of results for steady flow passing through a circular cylinder. The results are obtained under Mesh II.

4.2 Subsonic NACA0012 airfoil

(a) Inviscid case

The inviscid flow passing through a NACA0012 airfoil with incoming Mach number Ma=0.5 and angle of attack (AOA) $\alpha=2.0^{\circ}$ is simulated first. The reflective boundary condition is imposed on the airfoil surface and the non-reflecting boundary condition based on the Riemann invariants is adopted on the far field. Total $6538\times 2$ hybrid prismatic cells are used in a cuboid domain $[-15,15]\times[15,15]\times[0,0.1]$ . The local unstructured mesh is shown in Fig. 7, which is colored by the computed pressure distributions. Almost identical Mach distributions are obtained for both explicit and p-multigrid CGKS, as given in Fig. 8. The CPU time history of density residuals in Fig. 9 shows that the residual of p-multigrid CGKS can drop to $10^{-8}$ at around 350 seconds while the explicit CGKS needs about 4500 seconds. A 13 times speedup is achieved in this case.

(a) Viscous case

A viscous flow passing through a NACA0012 airfoil is simulated then. The Reynolds number is Re=5000 based on the chord length $L=1$ . The incoming Mach number is Ma=0.5 and the AOA is 0^∘. The non-slip adiabatic boundary condition is imposed on the airfoil surface. Unstructured mixed-element mesh with $7922\times 2$ cells is used with a near wall size $h=4\times 10^{-4}$ . The maximum aspect ratio of cells is about 62. The local enlargements of the mesh are presented in Fig. 10. For this case, the speedup is about 15 at a residual level of $10^{-8}$ , as shown in Fig. 11. Computed pressure and Mach distributions obtained by the current scheme are shown in Fig. 10 and Fig. 12. Quantitative results including the surface pressure coefficient and skin friction coefficient are extracted and plotted in Fig. 13, which agree nicely with the reference data from the second-order finite-volume method [12] and fourth-order DG method [3].

4.3 Shock reflection problem

It is reported that the high-order methods can easily fail to be convergent for flow with shocks [39]. The shock reflection problem is tested to validate the performance of CGKS for solving steady-state problems containing shocks. The geometry is a rectangle with length 4 and height 1. A Cartesian mesh with $120\times 30\times 2$ cells is used. The reflective wall is given on the bottom of the computational domain. The supersonic outflow boundary condition is imposed on the right. The other two sides are Dirichlet conditions:

	$\displaystyle(\rho,U_{1},U_{2},p)$	$\displaystyle=(1.0,2.9,0,1.0/1.4)\|_{\text{left}},$
	$\displaystyle(\rho,U_{1},U_{2},p)$	$\displaystyle=(1.69997,2.61934,-0.50632,1.52819)\|_{\text{up}}.$

A uniform flow with the value of left boundary is set initially. To avoid undesired spurious oscillations around discontinuities, the reconstruction based on the characteristic variables is adopted. Since the numerical convergence of this problem is sensitive to the value of small positive number $\epsilon$ in the compact WENO reconstruction [13], two $\epsilon$ , i.e., $10^{-2}$ and $10^{-6}$ are chosen in the test. Through the CPU time history of the density residuals in Fig. 14, it can be observed that with $\epsilon=10^{-2}$ , both explicit and p-multigrid CGKS can convergence nicely with a residual level less than $10^{-11}$ . However, with $\epsilon=10^{-6}$ , the errors for both schemes settle around $10^{-7}$ . About 3 times of speedup is achieved in this case. On the other hand, there is barely any difference that can be observed in 2-D density contours extracted from the central $x-y$ plane with the same $\epsilon$ , as shown in Fig. 15 and Fig. 16. In addition, with the increasing of $\epsilon$ , the non-linear weights are closer to the linear weights and the numerical solutions near the shock become slightly oscillatory, as shown in Fig. 17.

4.4 Transonic dual NACA0012 airfoils

To further test the convergence performance for the current CGKS with discontinuities and complex geometry, a two-dimensional transonic flow passing through dual NACA0012 airfoils is tested. The head of the first airfoil is located at $(0,0)$ and the second one is located at $(0.5,0.5)$ . Both airfoils are put in parallel with the x-axis. The Reynolds number is Re=500 based on the chord length $L=1$ . The incoming Mach number is Ma=0.8 and the AOA is 10^∘. The airfoil is set as non-slip and adiabatic. Unstructured mixed-element mesh with $14339\times 2$ cells is used with a near wall size $h=2\times 10^{-3}$ . The local enlargements of the mesh and the pressure distributions obtained by the p-multigrid CGKS are presented in Fig. 18. For this case, the residuals of explicit CGKS cannot reach a steady level even with $10^{6}$ steps, as shown in Fig. 19. Thus, the total drag coefficients Cd are used as the convergence criterion. While the p-multigrid CGKS can obtain a converged Cd within 2000 seconds, the explicit counterpart needs more than $2\times 10^{5}$ seconds to achieve a similar result, which means the speed up is more than 100. The Mach distribution and streamline around both airfoils are shown in Fig. 20. The oblique shock wave can be observed at the front of the top airfoil. Two vortices are formed in the separated region of the top airfoil, which agrees well with the referenced results by the second-order finite-volume method [12]. The surface pressure coefficient and skin friction coefficient are also extracted and compared with the reference data [12], as shown in Fig. 21.

4.5 Flow passing through a sphere

Flow passing through a sphere with different Mach number are tested. The Reynolds number is based on the diameter of the sphere $D=1$ . The far-field condition is set at outside boundary of the domain with the free stream condition

\begin{split}(\rho,U,V,W,p)_{\infty}=(1,Ma,0,0,\frac{1}{\gamma}),\end{split}

with $\gamma=1.4$ . The non-slip adiabatic boundary condition is imposed on the sphere. The mesh sample is shown in Fig. 22.

(a) Subsonic case

A low-speed viscous case with Re=118 is presented first. The first mesh off the wall has the size $h\approx 1.0\times 10^{-2}D$ , and the total CELL number is $6144\times 30$ . The residuals can be close to machine zero for both explicit and p-multigrid CGKS. The speedup for the p-multigrid CGKS is about 4.5 at a residual level of $10^{-16}$ , as shown in Fig. 23. The Mach contour and streamline are also presented in Fig. 23 to show the high resolution of the CGKS. Quantitative results are given in Table 2, including the drag coefficient Cd, the separation angle $\theta$ , and the closed wake length $L$ , as defined in [14].

Scheme Mesh number Cd $\theta$ L Cl Experiment [29] – 1.0 151 1.07 – Third-order DDG [7] 160,868 1.016 123.7 0.96 – Fourth-order VFV [31] 458,915 1.014 – – 2.0e-5 Current 184,320 1.002 124.9 0.94 3.5e-3

Table 2: Quantitative comparisons among different compact schemes for the subsonic flow passing through a sphere.

(d) Supersonic case

To validate the effectiveness of the current scheme for the 3-D high-speed viscous flow, a supersonic viscous sphere with Ma=1.2 is tested. The Reynolds number is set as 300 and the Prandtl number is $Pr=1$ . The first mesh off the wall has the size $h\approx 4.5\times 10^{-2}D$ and the total cell number is 50688. Fig. 24 shows the density residuals of different schemes. Similar to the shock reflection problem, the small parameter $\epsilon$ can significantly affect the final convergent residuals. However, similar drag coefficients can be obtained for different $\epsilon$ , with a relative error of about 3%. With the same $\epsilon$ , the almost identical drag coefficients are obtained for both the p-multigrid and explicit CGKS. The p-multigrid CGKS obtains a converged Cd within 200 seconds, where a more than 15 speedup is achieved in comparison with the explicit one. The numerical results including the Mach contour, surface pressure distribution, and streamline around sphere are shown in Fig. 25. Quantitative results are listed in Table 3, which have a good agreement with those given by Nagata et al. [22].

Scheme Mesh Number Cd $\theta$ L Shock stand-off WENO6 [22] 909,072 1.281 150.9 0.38 0.21 Current with $\epsilon=10^{-6}$ 50,688 1.398 148.5 0.45 0.28-0.31 Current with with $\epsilon=10^{-2}$ 50,688 1.354 149.2 0.45 0.28-0.31

Table 3: Quantitative comparisons between the current scheme and the reference solution for the supersonic flow passing through a sphere.

5 Conclusions

In this paper, the p-multigrid method is adopted for the CGKS. A two-level algorithm is used to drive the scheme to the convergent steady-state solutions. In the high-order level, the explicit single-step third-order CGKS is used. In the low-order level, the first-order scheme with implicit point-relaxation method is applied as an iterative smoother. The resulting scheme is simple and low-storage. A series of test cases is presented on 3-D mixed-element mesh. It is worth remarking that the residuals change almost identically along with the iteration steps under either serial or parallel computation, which suggests the current scheme is highly scalable. Efficiency in aspect of CPU time can be increased by one order of magnitude for the p-multigrid CGKS in comparison with the original explicit scheme. The numerical results suggest that higher speedup can be achieved under the mesh with a wide variation of cell size. The algorithm is not sensitive to the use of difference flux solvers on the low-order level. Moreover, the convergence for transonic and supersonic flow is studied. The quantitative results from the p-multigrid method are almost identical to the explicit counterpart. The same order of speedup can be kept for high-speed flow. In further work, the CGKS coupled with turbulent modeling will be developed for the Reynolds-averaged Navier-Stokes solutions.

Acknowledgments

The authors would like to thank Dr. Yajun Zhu and Mr. Yue Zhang for helpful discussions on the p-multigrid method. The current research is supported by National Numerical Windtunnel project, National Science Foundation of China (11772281, 91852114), Hong Kong Research Grant Council (16208021).

References

[1] Antonis F. Antoniadis, Panagiotis Tsoutsanis, and Dimitris Drikakis. Assessment of high-order finite volume methods on unstructured meshes for RANS solutions of aeronautical configurations. Computers & Fluids, 146:86 – 104, 2017.
[2] F. Bassi, A. Ghidoni, S. Rebay, and P. Tesini. High-order accurate p-multigrid discontinuous Galerkin solution of the Euler equations. International Journal for Numerical Methods in Fluids, 60(8):847–865, 2009.
[3] F. Bassi and S. Rebay. A high-order accurate discontinuous finite element method for the numerical solution of the compressible Navier-Stokes equations. Journal of Computational Physics, 131(2):267–279, 1997.
[4] Ram Prakash Bharti, R. P. Chhabra, and V. Eswaran. Steady flow of power law fluids across a circular cylinder. The Canadian Journal of Chemical Engineering, 84(4):406–421, 2006.
[5] Prabhu Lal Bhatnagar, Eugene P Gross, and Max Krook. A model for collision processes in gases. I. Small amplitude processes in charged and neutral one-component systems. Physical Review, 94(3):511, 1954.
[6] M. Braza, P. Chassaing, and H. Ha Minh. Numerical study and physical analysis of the pressure and velocity fields in the near wake of a circular cylinder. Journal of Fluid Mechanics, 165:79–130, 1986.
[7] Jian Cheng, Xiaodong Liu, Tiegang Liu, and Hong Luo. A parallel, high-order direct discontinuous Galerkin method for the Navier-Stokes equations on 3D hybrid grids. Communications in Computational Physics, 21(5):1231–1257, 2017.
[8] Madeleine Coutanceau and Roger Bouard. Experimental determination of the main features of the viscous flow in the wake of a circular cylinder in uniform translation. Part 1. Steady flow. Journal of Fluid Mechanics, 79(2):231–256, 1977.
[9] A. S. Grove, F. H. Shair, and E. E. Petersen. An experimental investigation of the steady separated flow past a circular cylinder. Journal of Fluid Mechanics, 19(1):60–80, 1964.
[10] Florian Hindenlang, Gregor J. Gassner, Christoph Altmann, Andrea Beck, Marc Staudenmaier, and Claus-Dieter Munz. Explicit discontinuous Galerkin methods for unsteady problems. Computers & Fluids, 61:86–93, 2012. ”High Fidelity Flow Simulations” Onera Scientific Day.
[11] Antony Jameson and Seokkwan Yoon. Lower-upper implicit schemes with multiple grids for the Euler equations. AIAA journal, 25(7):929–935, 1987.
[12] P. Jawahar and Hemant Kamath. A high-resolution procedure for Euler and Navier-Stokes computations on unstructured grids. Journal of Computational Physics, 164(1):165–203, 2000.
[13] Xing Ji, Wei Shyy, and Kun Xu. A gradient-compression-based compact high-order gas-kinetic scheme on three-dimensional hybrid unstructured mesh. arXiv preprint arXiv:2107.05169, 2021.
[14] Xing Ji, Fengxiang Zhao, Wei Shyy, and Kun Xu. Compact high-order gas-kinetic scheme for three-dimensional flow simulations. AIAA Journal, 0(0):1–18, 0.
[15] Xing Ji, Fengxiang Zhao, Wei Shyy, and Kun Xu. A HWENO reconstruction based high-order compact gas-kinetic scheme on unstructured mesh. Journal of Computational Physics, page 109367, 2020.
[16] Xing Ji, Fengxiang Zhao, Wei Shyy, and Kun Xu. Two-step multi-resolution reconstruction-based compact gas-kinetic scheme on tetrahedral mesh. arXiv preprint arXiv:2102.01366, 2021.
[17] Ji Li, Chengwen Zhong, Yong Wang, and Congshan Zhuo. Implementation of dual time-stepping strategy of the gas-kinetic scheme for unsteady flow simulations. Physical Review E, 95:053307, 2017.
[18] Jiequan Li and Zhifang Du. A two-stage fourth order time-accurate discretization for Lax–Wendroff type flow solvers I. hyperbolic conservation laws. SIAM Journal on Scientific Computing, 38(5):A3046–A3069, 2016.
[19] C. Liang, R. Kannan, and Z.J. Wang. A p-multigrid spectral difference method with explicit and implicit smoothers on unstructured triangular grids. Computers & Fluids, 38(2):254–265, 2009.
[20] Xiaodong Liu, Lijun Xuan, Yidong Xia, and Hong Luo. A reconstructed discontinuous Galerkin method for the compressible Navier-Stokes equations on three-dimensional hybrid grids. Computers & Fluids, 152:217–230, 2017.
[21] Hong Luo, Joseph D. Baum, and Rainald Löhner. A p-multigrid discontinuous Galerkin method for the Euler equations on unstructured grids. Journal of Computational Physics, 211(2):767–783, 2006.
[22] T. Nagata, T. Nonomura, S. Takahashi, Y. Mizuno, and K. Fukuda. Investigation on subsonic to supersonic flow around a sphere at low Reynolds number of between 50 and 300 by direct numerical simulation. Physics of Fluids, 28(5):056101, 2016.
[23] Liang Pan and Kun Xu. A third-order compact gas-kinetic scheme on unstructured meshes for compressible Navier–Stokes solutions. Journal of Computational Physics, 318:327–348, 2016.
[24] Liang Pan and Kun Xu. High-order gas-kinetic scheme with three-dimensional WENO reconstruction for the Euler and Navier-Stokes solutions. Computers & Fluids, 198:104401, 2020.
[25] Einar M. Rønquist and Anthony T. Patera. Spectral element multigrid. I. Formulation and numerical results. Journal of Scientific Computing, 2(4):389–406, 1987.
[26] David C Seal, Yaman Güçlü, and Andrew J Christlieb. High-order multiderivative time integrators for hyperbolic conservation laws. Journal of Scientific Computing, 60(1):101–140, 2014.
[27] Ziyao Sun, Satoshi Inaba, and Feng Xiao. Boundary variation diminishing (BVD) reconstruction: A new approach to improve Godunov schemes. Journal of Computational Physics, 322:309–325, 2016.
[28] Shuang Tan and Qibing Li. Time-implicit gas-kinetic scheme. Computers & Fluids, 144:44–59, 2017.
[29] Sadatoshi Taneda. Experimental investigation of the wakes behind cylinders and plates at low Reynolds numbers. Journal of the Physical Society of Japan, 11(3):302–307, 1956.
[30] David J Tritton. Experiments on the flow past a circular cylinder at low Reynolds numbers. Journal of Fluid Mechanics, 6(4):547–567, 1959.
[31] Qian Wang. Compact High-Order Finite Volume Method on Unstructured Grids. PhD thesis, Tsinghua University, 6 2017.
[32] Qian Wang, Yu-Xin Ren, Jianhua Pan, and Wanai Li. Compact high order finite volume method on unstructured grids III: Variational reconstruction. Journal of Computational Physics, 337:1–26, 2017.
[33] Zhijian J Wang, Krzysztof Fidkowski, Rémi Abgrall, Francesco Bassi, Doru Caraeni, Andrew Cary, Herman Deconinck, Ralf Hartmann, Koen Hillewaert, Hung T Huynh, et al. High-order CFD methods: current status and perspective. International Journal for Numerical Methods in Fluids, 72(8):811–845, 2013.
[34] Kun Xu. A gas-kinetic BGK scheme for the Navier–Stokes equations and its connection with artificial dissipation and Godunov method. Journal of Computational Physics, 171(1):289–335, 2001.
[35] Kun Xu. Direct Modeling for Computational Fluid Dynamics: Construction and Application of Unified Gas-Kinetic Schemes. World Scientific, 2014.
[36] Xiaoquan Yang, Jian Cheng, Hong Luo, and Qijun Zhao. Robust implicit direct discontinuous Galerkin method for simulating the compressible turbulent flows. AIAA Journal, 57(3):1113–1132, 2019.
[37] Li Yuan. Comparison of implicit multigrid schemes for three-dimensional incompressible flows. Journal of Computational Physics, 177(1):134–155, 2002.
[38] Fan Zhang, Jian Cheng, and Tiegang Liu. A direct discontinuous Galerkin method for the incompressible Navier–Stokes equations on arbitrary grids. Journal of Computational Physics, 380:269–294, 2019.
[39] Shuhai Zhang, Jun Zhu, and Chi-Wang Shu. A brief review on the convergence to steady state solutions of Euler equations with high-order WENO schemes. Advances in Aerodynamics, 1(1):16, 2019.
[40] Fengxiang Zhao, Xing Ji, Wei Shyy, and Kun Xu. Compact higher-order gas-kinetic schemes with spectral-like resolution for compressible flow simulations. Advances in Aerodynamics, 1(1):13, 2019.
[41] Fengxiang Zhao, Xing Ji, Wei Shyy, and Kun Xu. An acoustic and shock wave capturing compact high-order gas-kinetic scheme with spectral-like resolution. International Journal of Computational Fluid Dynamics, pages 1–26, 2020.
[42] Fengxiang Zhao, Xing Ji, Wei Shyy, and Kun Xu. A compact high-order gas-kinetic scheme on unstructured mesh for acoustic and shock wave computations. arXiv preprint arXiv:2010.05717, 2020.
[43] Fengxiang Zhao, Xing Ji, Wei Shyy, and Kun Xu. Direct modeling for computational fluid dynamics and the construction of high-order compact scheme for compressible flow simulations. arXiv preprint arXiv:2107.06555, 2021.
[44] Yajun Zhu, Chengwen Zhong, and Kun Xu. Implicit unified gas-kinetic scheme for steady state solutions in all flow regimes. Journal of Computational Physics, 315:16–38, 2016.