
Parameter Estimation and Adaptive Solution of the Leray-Burgers Equation using Physics-Informed Neural Networks

Bong-Sik Kim (Department of Mathematics and Physics, American University of Ras Al Khaimah, UAE; e-mail: bkim@aurak.ac.ae), Yuncherl Choi (Ingenium College of Liberal Arts, Kwangwoon University, Seoul 01891, Korea; e-mail: yuncherl@kw.ac.kr), and Doo Seok Lee (Department of Undergraduate Studies, Daegu Gyeongbuk Institute of Science and Technology, Daegu 42988, Korea; e-mail: dslee@dgist.ac.kr)
(May 7, 2025)
Abstract

This study presents a unified framework that integrates physics-informed neural networks (PINNs) to address both the inverse and forward problems of the one-dimensional Leray-Burgers equation. First, we investigate the inverse problem by empirically determining the characteristic wavelength parameter α at which the Leray-Burgers solutions closely approximate those of the inviscid Burgers equation. Through PINN-based computational experiments on inviscid Burgers data, we identify a physically consistent range for α, between 0.01 and 0.05 for continuous initial conditions and between 0.01 and 0.03 for discontinuous profiles, demonstrating the dependence of α on the initial data. Next, we solve the forward problem using a PINN architecture in which α is dynamically optimized during training via a dedicated subnetwork, Alpha2Net. Crucially, Alpha2Net enforces α to remain within the bounds derived from the inverse problem, ensuring physical fidelity while jointly optimizing the network parameters (weights and biases). This integrated approach effectively captures complex dynamics, such as shock and rarefaction waves. This study also highlights the effectiveness and efficiency of the Leray-Burgers equation in real practical problems, specifically Traffic State Estimation.

1 Introduction

In this study, we explore the one-dimensional Leray-Burgers (LB) equation (2.1), a regularized model of the inviscid Burgers equation, which introduces a wavelength parameter α to prevent finite-time blow-ups while aiming to preserve the essential dynamics of the inviscid case, such as shock waves. However, selecting an appropriate α is a non-trivial task, as its value significantly influences the fidelity of the LB solutions to those of the inviscid Burgers equation. To address this challenge, we employ Physics-Informed Neural Networks (PINNs) [15, 14] in a two-step process that bridges the inverse and forward problems.

First, we tackle the inverse problem, empirically determining the range of α that aligns LB solutions with inviscid Burgers entropy solutions under various initial conditions (Section 4). We find that the choice of α depends on the initial data. For continuous initial profiles, the practical range of α is 0.01 to 0.05, whereas for discontinuous initial profiles it is 0.01 to 0.03 (Section 4.2). This step establishes a critical foundation by identifying the bounds within which α yields physically meaningful results.

Next, we turn to the forward inference problem in Section 5, where PINNs are utilized to solve the LB equation under different initial and boundary conditions. Unlike traditional approaches where α might be arbitrarily fixed, we treat α as a learnable parameter depending on t, optimized during training alongside the standard PINN parameters (weights and biases). This optimization is facilitated by a dedicated subnetwork, Alpha2Net, integrated into the PINN architecture. To ensure that the trained α remains physically relevant, Alpha2Net constrains its values to the range determined from the inverse problem. This constraint links the two problems directly: the inverse problem informs the forward inference by providing a predetermined range that guides the optimization process, ensuring that the resulting solutions are accurate and consistent with the physical insights gained earlier.

We also apply the LB equation to traffic state estimation to highlight its effectiveness and efficiency in real practical problems in Section 6. We present a brief investigation of two variants of the Lighthill-Whitham-Richards (LWR) traffic flow model, namely LWR-α (based on the Leray-Burgers equation) and LWR-ε (based on the viscous Burgers equation), in the context of Traffic State Estimation (TSE). The result demonstrates the efficacy of the LWR-α model as a suitable alternative for traffic state estimation, outperforming the diffusion-based LWR-ε model in terms of computational efficiency.

2 Background

2.1 Leray-Burgers Equation

We consider a problem of computing the solution v:[0,T]\times\Omega\to\mathbb{R} of an evolution equation

v_{t}(t,x)+\mathcal{N}_{\alpha}[v](t,x)=0, \quad \forall(t,x)\in[0,T]\times\Omega, \qquad (2.1)
v(0,x)=v_{0}(x), \quad \forall x\in\Omega,

where \mathcal{N}_{\alpha} is a nonlinear differential operator acting on v with a small constant parameter \alpha>0,

\mathcal{N}_{\alpha}[v]=vv_{x}+\alpha^{2}v_{x}v_{xx}. \qquad (2.2)

Here, the subscripts of v mean partial derivatives in t and x, \Omega\subset\mathbb{R} is a bounded domain, T denotes the final time, and v_{0}:\Omega\to\mathbb{R} is the prescribed initial data. Although the methodology allows for different types of boundary conditions, we restrict our discussion to Dirichlet or periodic cases and prescribe the boundary data as

v_{b}(t,x)=v(t,x), \quad \forall(t,x)\in[0,T]\times\partial\Omega,

where \partial\Omega denotes the boundary of the domain \Omega.

Equation (2.1) is called the Leray-Burgers equation (LB). It is also known in the literature as the Burgers-α equation, the convectively filtered Burgers equation, a Leray regularized reduced order model, etc. Bhat and Fetecau [2] introduced (2.1) as a regularized approximation to the inviscid Burgers equation

v_{t}+vv_{x}=0. \qquad (2.3)

They considered a special smoothing kernel associated with the Green function of the Helmholtz operator

u_{\alpha}=\mathcal{H}^{-1}_{\alpha}v=(I-\alpha^{2}\partial_{x}^{2})^{-1}v, \quad (I=\text{identity}),

where α>0 is interpreted as the characteristic wavelength scale below which the smaller physical phenomena are averaged out (see, for example, [8]), and it accelerates energy decay. Applying the smoothing kernel to the convective term in (2.3) yields

v_{t}+u_{\alpha}v_{x}=0, \qquad (2.4)

where v=v(t,x) is a vector field and u_{\alpha} is the filtered vector field. The filtered vector u_{\alpha} is smoother than v, and equation (2.4) is a nonlinear Leray-type regularization [12] of the inviscid Burgers equation. Here and in the following, we abuse notation and write u for the filtered vector u_{\alpha}. If we express equation (2.4) in the filtered vector u, it becomes a quasilinear evolution equation that consists of the inviscid Burgers equation plus \mathcal{O}(\alpha^{2}) nonlinear terms [2, 3, 4]:

u_{t}+uu_{x}=\alpha^{2}(u_{txx}+uu_{xxx}). \qquad (2.5)

In this paper, we follow Zhao and Mohseni [21] and expand the inverse Helmholtz operator in α to higher orders of the Laplacian operator:

(1-\alpha^{2}\Delta)^{-1}=1+\alpha^{2}\Delta+\alpha^{4}\Delta^{2}+\cdots \quad \text{if}\ \alpha\lambda_{\mathrm{max}}<1,

where \lambda_{\mathrm{max}} is the highest eigenvalue of the discretized operator \Delta. We can then write (2.4) in the unfiltered vector field v to obtain equations (2.1)-(2.2) with an \mathcal{O}(\alpha^{4}) truncation error.
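To illustrate the truncation numerically, the following minimal sketch (ours, not taken from the cited works) compares the discrete inverse Helmholtz operator with its second-order Neumann-series truncation on a small periodic grid; the grid size and the value of α are arbitrary choices satisfying the convergence condition.

```python
import numpy as np

# Discrete check of (1 - alpha^2 D2)^{-1} ~ 1 + alpha^2 D2 + alpha^4 D2^2
# on a periodic grid; valid when the spectral radius of alpha^2 D2 is below 1.
n, alpha = 64, 0.02
h = 2.0 * np.pi / n
D2 = (-2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / h**2
D2[0, -1] = D2[-1, 0] = 1.0 / h**2            # periodic wrap-around

exact = np.linalg.inv(np.eye(n) - alpha**2 * D2)
truncated = np.eye(n) + alpha**2 * D2 + alpha**4 * (D2 @ D2)
print(np.max(np.abs(exact - truncated)))       # neglected terms are O((alpha^2 D2)^3)
```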

For smooth initial data v(0,x) that decrease at at least one point (so there exists y such that v_{x}(0,y)<0), the classical solution v(t,x) of the inviscid Burgers equation (the case α=0) fails to exist beyond a specific finite break time T_{s}>0. This is because the characteristics of the inviscid equation intersect in finite time. The Leray-Burgers equation bends the characteristics so that they do not intersect each other, avoiding any finite-time intersection and remedying the finite-time breakdown [2, 4]. Consequently, for α>0 the Leray-Burgers equation possesses a classical solution globally in time for smooth initial data [2]:

Theorem 1.

Given initial data v_{0}\in W^{2,1}(\mathbb{R})=\{u\in L^{1}(\mathbb{R}):D^{s}u\in L^{1}(\mathbb{R})\ \text{for all}\ |s|\leq 2\}, the Leray-Burgers equation (2.4) possesses a unique solution v(t,x)\in W^{2,1}(\mathbb{R}) for all t>0.

Furthermore, the Leray-Burgers solution u_{\alpha}(t,x) with initial data u_{\alpha}(0,x)=\mathcal{H}^{-1}_{\alpha}v_{0}(x) for v_{0}\in W^{2,1}(\mathbb{R}) converges strongly, as \alpha\rightarrow 0^{+}, to a global weak solution v(t,x) of the following initial value problem for the inviscid Burgers equation (Theorem 2 in [2]):

v_{t}+\frac{1}{2}\left(v^{2}\right)_{x}=0 \quad \text{with} \quad v(0,x)=v_{0}(x).

Bhat and Fetecau [2] found numerical evidence that the weak solution selected in the zero-α limit satisfies the Oleinik entropy inequality, making the solution physically appropriate. The proof relies on uniform estimates of the unfiltered velocity v rather than the filtered velocity u, which made possible the strong convergence of the Leray-Burgers solution to the correct entropy solution of the inviscid Burgers equation. In the context of the filtered velocity u_{\alpha}, they also showed that the Leray-Burgers equation captures the correct shock solution of the inviscid Burgers equation for Riemann data consisting of a single decreasing jump [4]. However, since u_{\alpha} captures an unphysical solution for Riemann data comprised of a single increasing jump, it was necessary to control the behavior of the regularized equation by introducing an arbitrary mollification of the Riemann data to capture the correct rarefaction solution of the inviscid Burgers equation. With that modification, they extended the existence results to the case of discontinuous initial data u_{\alpha}\in L^{\infty}. However, the case of initial data v_{0}\in L^{\infty} remains an open problem. In [7], Guelmame et al. derived a regularized equation similar to (2.5):

u_{t}+uu_{x}=\alpha^{2}(u_{txx}+2u_{x}u_{xx}+uu_{xxx}), \qquad (2.6)

which has an additional term 2u_{x}u_{xx} on the right-hand side. Notice that u in this equation is the filtered vector field in (2.4). When establishing the existence of the entropy solution, Guelmame et al. resorted to altering equation (2.6), just as Bhat and Fetecau had to modify the initial data for their proof in [4]. Analysis in the context of the filtered vector field u appears to require additional modification of the equations to achieve the desired results. Working with the actual vector field v may avoid such arbitrary changes.

Equation (2.4) and related models have previously appeared in the literature. We refer to [1, 2, 3, 4, 6, 7, 11, 13, 17, 20, 18] for more properties related to the Leray-Burgers equation. The paper [17] explores the role of α in regularizing Proper Orthogonal Decomposition (POD)-Galerkin models for the Kuramoto-Sivashinsky (KS) equation. The α-regularization is introduced to enhance the stability and accuracy of these models by applying Helmholtz filtering to the eigenmodes of the quadratic terms. This filtering controls energy transfer between modes, specifically reducing the impact of high-wavenumber modes that contribute to instability, while preserving the system’s key dynamical features. The link between regularization procedures such as Helmholtz regularization and numerical schemes, for example, had been studied in [6, 13]. They argued that, in numerical computations, the parameter α^2 cannot be interpreted solely as a length scale because it also depends on the numerical discretization scheme chosen. They observed that the choice of α depends on a relation between α and the mesh size that preserves stability and consistency with conservation conditions for the chosen numerical scheme [2, 6, 13]. Also, they found that, for a fixed number of grid points, there is a particular value α ≈ 0.02 below which the solution becomes oscillatory (even with continuous initial profiles).

2.2 Leray Regularization and Conserved Quantities

The Leray-type regularization was first introduced by Jean Leray for the Navier-Stokes equations governing incompressible fluid flow [12]. This regularization scheme enhances stability and accuracy by applying a Helmholtz filter to the convective term, whose effect is evident in the Fourier domain:

\widehat{\mathcal{H}^{-1}v}=\frac{\widehat{v}}{1+\alpha^{2}k^{2}}.

This filtering mechanism regulates the energy transfer among different modes by attenuating high-wavenumber contributions, which are typically associated with instability. The parameter α thus serves as a regularization constant that sets the length scale of filtering, suppressing smaller spatial scales while preserving the dominant dynamical behavior of the flow.
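As an illustration of this attenuation, the following minimal sketch (ours, not from the cited works) applies the filter symbol 1/(1+α²k²) to a sharp profile using the FFT on a periodic grid; the grid and the test profile are arbitrary choices.

```python
import numpy as np

# Apply the Helmholtz filter u_alpha = (I - alpha^2 d^2/dx^2)^{-1} v on a
# periodic grid: in Fourier space this is multiplication by 1/(1 + alpha^2 k^2).
def helmholtz_filter(v, dx, alpha):
    k = 2.0 * np.pi * np.fft.fftfreq(v.size, d=dx)    # angular wavenumbers
    return np.real(np.fft.ifft(np.fft.fft(v) / (1.0 + alpha**2 * k**2)))

x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
v = np.where(x < np.pi, 1.0, 0.0)                      # sharp step profile
u = helmholtz_filter(v, x[1] - x[0], alpha=0.05)       # step smoothed on scales ~ alpha
```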

In the realm of stochastic partial differential equations, Leray regularization has proven effective in enhancing the stability of reduced order models (ROMs), particularly for convection-dominated systems. Iliescu et al., [11], explored this in their study of a stochastic Burgers equation driven by linear multiplicative noise. They found that standard Galerkin ROMs (G-ROMs) produce spurious numerical oscillations in convection-dominated regimes, a problem exacerbated by increasing noise amplitude. To counter this, they applied an explicit spatial filter to the convective term creating the Leray ROM (L-ROM). This approach significantly mitigates oscillations, yielding more accurate and stable solutions compared to the G-ROM, especially under stochastic perturbations. The L-ROM’s robustness to noise variations suggests that Leray regularization may help preserve statistical properties or conserved quantities, such as energy or moments of the solution, in a stochastic context. This extends the utility of Leray regularization beyond deterministic settings, offering a practical tool for modeling complex stochastic dynamics while maintaining numerical fidelity.

Lemma 2.1.

(Conservation of Energy and Mass) For the Leray-Burgers equation (a\leq x\leq b,\ 0\leq t)

v_{t}+\left(\tfrac{1}{2}v^{2}\right)_{x}+\alpha^{2}v_{x}v_{xx}=0

with periodic boundary conditions, both energy and mass are conserved if the regularization parameter α(t,x) is constant with respect to x, that is, α = α(t).

Proof.

Using the identity \alpha^{2}v_{x}v_{xx}=\frac{\partial}{\partial x}\left(\frac{1}{2}\alpha^{2}v_{x}^{2}\right)-\frac{1}{2}(\alpha^{2})_{x}v_{x}^{2}, the Leray-Burgers equation can be transformed to the form

v_{t}+\frac{\partial}{\partial x}\left(\frac{1}{2}v^{2}+\frac{1}{2}\alpha^{2}v_{x}^{2}\right)-\frac{1}{2}(\alpha^{2})_{x}v_{x}^{2}=0.

To preserve energy conservation, the source term must vanish, i.e.,

\frac{1}{2}(\alpha^{2})_{x}v_{x}^{2}=0.

This condition is satisfied if (\alpha^{2})_{x}=0, implying that \alpha^{2} is constant with respect to x. Then the source term vanishes, and the equation takes the conservation form:

v_{t}+\frac{\partial}{\partial x}\left(\frac{1}{2}v^{2}+\frac{1}{2}\alpha^{2}v_{x}^{2}\right)=0,

which ensures energy conservation.

For mass conservation, we want:

\frac{d}{dt}\int_{a}^{b}v(t,x)\,\mathrm{d}x=\int_{a}^{b}v_{t}(t,x)\,\mathrm{d}x=-\int_{a}^{b}\frac{\partial}{\partial x}\left(\frac{1}{2}v^{2}+\frac{1}{2}\alpha^{2}v_{x}^{2}\right)\,\mathrm{d}x=0.

This holds because, with α constant in x, the Leray-Burgers equation is in pure flux form and the boundary terms cancel by periodicity. Hence mass conservation also holds. ∎

3 PINN Structure for Inverse and Forward Problems

Figure 1: The PINN architecture for solving the Leray-Burgers equation: the diagram illustrates the surrogate neural network for predicting v(t,x), the Alpha2Net subnetwork for optimizing α, and the loss components enforcing the PDE, initial, and boundary conditions.

We employ a Physics-Informed Neural Network (PINN) to address both the inverse and forward problems for the Leray-Burgers (LB) equation, as depicted in Figure 1. The LB equation, given by

v_{t}+vv_{x}+\alpha^{2}v_{x}v_{xx}=0,

is solved in various initial and boundary condition scenarios, with the characteristic wavelength parameter α either fixed (α = constant) or adaptively optimized (α = α(t)). The PINN architecture, shown in Figure 1, consists of two primary components: a main neural network to approximate the solution v(t,x) and a subnetwork, Alpha2Net, to learn the parameter α. The main network is a fully connected feed-forward neural network (multilayer perceptron, MLP) with eight hidden layers, each containing 20 neurons, and employs tanh activation functions. It takes spatio-temporal coordinates t and x as inputs and outputs the predicted solution v(t,x). To enforce the physics of the LB equation, automatic differentiation is used to compute the derivatives \hat{v}_{t}, \hat{v}_{x}, and \hat{v}_{xx}, which are then used to evaluate the PDE residual.

The Alpha2Net subnetwork, highlighted in the upper part of Figure 1, is designed to adaptively learn the parameter α as a function of time, i.e., α(t). This subnetwork is a smaller MLP with three hidden layers, each containing 10 neurons, and also uses tanh activation functions. It takes the time coordinate t as its sole input and outputs α^2(t), for use in the PDE residual. To ensure physical consistency, Alpha2Net constrains α^2(t) to lie within the range [10^{-4}, 0.01], slightly broader than the practical range identified in the inverse problem in Section 4. This slight extension allows computational flexibility while maintaining alignment with the physically meaningful bounds established earlier. The constraint is implemented by applying a sigmoid activation at the output layer of Alpha2Net, scaled to map the output to the desired range:

\alpha^{2}(t)=10^{-4}+(0.01-10^{-4})\cdot\mathrm{sigmoid}(z),

where z is the raw output of the subnetwork. This ensures that α(t) remains within the specified bounds during training, preventing the network from converging to nonphysical values.
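A minimal sketch of how such a constrained subnetwork could be written (layer sizes follow the description above; the code itself is an illustrative reconstruction, not the authors' implementation):

```python
import tensorflow as tf

ALPHA2_MIN, ALPHA2_MAX = 1e-4, 1e-2   # bounds on alpha^2(t) tied to the inverse problem

# Three hidden layers of 10 tanh units mapping t -> raw output z.
def build_alpha2net():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="tanh"),
        tf.keras.layers.Dense(10, activation="tanh"),
        tf.keras.layers.Dense(10, activation="tanh"),
        tf.keras.layers.Dense(1),
    ])

# Scaled sigmoid keeps alpha^2(t) inside [ALPHA2_MIN, ALPHA2_MAX].
def alpha_squared(alpha2net, t):
    z = alpha2net(t)                  # t has shape (N, 1)
    return ALPHA2_MIN + (ALPHA2_MAX - ALPHA2_MIN) * tf.sigmoid(z)

# Example: alpha^2 at three times (values stay inside the prescribed bounds).
a2 = alpha_squared(build_alpha2net(), tf.constant([[0.0], [0.5], [1.0]]))
```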

The outputs of the main network (\hat{v},\hat{v}_{t},\hat{v}_{x},\hat{v}_{xx}) and of Alpha2Net (α) are combined to compute the PDE residual, as shown in the right part of Figure 1. The loss function is designed to enforce the physics of the LB equation and consists of three key components:

  • Residual Loss (enforcing the PDE):

    \mathcal{L}_{\text{r}}=\frac{1}{N_{r}}\sum_{i=1}^{N_{r}}\left(\frac{\partial\hat{v}}{\partial t}+\hat{v}\frac{\partial\hat{v}}{\partial x}+\alpha^{2}\frac{\partial\hat{v}}{\partial x}\frac{\partial^{2}\hat{v}}{\partial x^{2}}\right)^{2},

    where N_{r} is the number of collocation points sampled across the spatiotemporal domain.

  • Initial Condition Loss:

    \mathcal{L}_{\text{0}}=\frac{1}{N_{0}}\sum_{i=1}^{N_{0}}\left(\hat{v}(0,x_{i})-v_{0}(x_{i})\right)^{2},

    where N_{0} is the number of points sampled along the initial condition at t=0.

  • Boundary Condition Loss:

    \mathcal{L}_{\text{b}}=\frac{1}{N_{b}}\sum_{i=1}^{N_{b}}\left(\hat{v}(t_{i},x_{b})-v_{b}(t_{i},x_{b})\right)^{2},

    where N_{b} is the number of points sampled along the boundary x=x_{b}.

These terms are combined into a total loss

\mathcal{L}=w_{r}\mathcal{L}_{\text{r}}+w_{0}\mathcal{L}_{\text{0}}+w_{b}\mathcal{L}_{\text{b}},

where the weights w_{r}, w_{0}, w_{b} are either fixed (e.g., set to 1 for equal weighting) or tuned dynamically during training to balance the contributions of each loss term. The training points are adaptively sampled using Latin Hypercube Sampling, with a focus on regions exhibiting high PDE residuals or steep solution gradients, such as shock regions near discontinuities in the initial conditions.

The model is optimized using either the ADAM or the Limited-Memory BFGS (L-BFGS) optimizer with a decaying learning rate schedule over the training epochs. During training, the parameters of both the main network (weights and biases) and Alpha2Net are updated simultaneously to minimize the total loss \mathcal{L}. Performance is assessed by computing the L^2-error between the PINN predictions and the analytical solutions of the inviscid Burgers equation, obtained via the method of characteristics.
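The following sketch indicates how the residual and total loss could be assembled with TensorFlow automatic differentiation; it is an illustrative reconstruction, with v_net assumed to be the main MLP mapping (t, x) to v̂ and the unit loss weights assumed for simplicity.

```python
import tensorflow as tf

# Residual of the LB equation, v_t + v v_x + alpha^2 v_x v_xx, via nested tapes.
def lb_residual(v_net, t, x, alpha2):
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape(persistent=True) as inner:
            inner.watch([t, x])
            v = v_net(tf.concat([t, x], axis=1))
        v_t, v_x = inner.gradient(v, t), inner.gradient(v, x)
        del inner
    v_xx = outer.gradient(v_x, x)
    return v_t + v * v_x + alpha2 * v_x * v_xx

# Total loss L = w_r L_r + w_0 L_0 + w_b L_b with equal (unit) weights.
def total_loss(v_net, colloc, init, bdry, alpha2):
    (t_r, x_r), (x_0, v_0), (t_b, x_b, v_b) = colloc, init, bdry
    l_r = tf.reduce_mean(tf.square(lb_residual(v_net, t_r, x_r, alpha2)))
    l_0 = tf.reduce_mean(tf.square(v_net(tf.concat([tf.zeros_like(x_0), x_0], 1)) - v_0))
    l_b = tf.reduce_mean(tf.square(v_net(tf.concat([t_b, x_b], 1)) - v_b))
    return l_r + l_0 + l_b
```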

This architecture, as illustrated in Figure 1, effectively integrates the physical constraints of the LB equation into the neural network framework, allowing for both the approximation of the solution v(t,x) and the adaptive optimization of the regularization parameter α. The use of Alpha2Net to learn α^2(t) within a constrained range ensures that the solutions remain physically meaningful, bridging the inverse and forward problems seamlessly.

4 Inverse Problem for the Estimation of Parameter α

We set up the computational framework for the governing system (2.1) as

v_{t}+\lambda_{1}vv_{x}+\lambda_{2}v_{x}v_{xx}=0, \quad t\in[0,T],\ x\in\Omega \qquad (4.1)
v(0,x)=f(x), \quad x\in\Omega \qquad (4.2)
v(t,x)=g(t,x), \quad t\in[0,T],\ x\in\partial\Omega, \qquad (4.3)

where \Omega\subset\mathbb{R} is a bounded domain, \partial\Omega is the boundary of \Omega, f(x) is an initial distribution, and g(t,x) is boundary data. We intentionally introduced a new parameter λ_1 and set λ_2 = α^2. During the training process, the PINN will learn λ_1 to determine the validity of the obtained α for the inviscid Burgers equation (λ_1 = 1) along with the relative errors. We use numerical or analytical solutions of the exact inviscid and viscous Burgers equations to generate training data sets D with different initial and boundary conditions:

D=\left\{(t_{i},x_{i},v_{i}),\ i=1,\dots,N_{d}\right\},

where v_{i}=v(t_{i},x_{i}) denotes the output value at position x_{i}\in\Omega and time 0<t_{i}\leq T with the final time T. N_{d} refers to the number of training data. Our goal is to estimate the effective range of α such that the neural network v_{\theta} satisfies equations (4.1)-(4.3) and v_{\theta}(t_{i},x_{i})\approx v_{i}. The selected training models represent a range of initial conditions, from continuous initial data to discontinuous data, displaying shock and rarefaction waves.

4.1 PINN for Inverse Problem

Following the original work of Raissi et al. [15, 14], we use a Physics-Informed Neural Network (PINN) to determine physically meaningful α-values closely approximating the entropy solutions to the inviscid Burgers equation. For the inverse problem, Alpha2Net in Figure 1 is not used because we are looking for a fixed value of α. The PINN enforces the physical constraint

\mathcal{F}(t,x):=v_{t}+\lambda_{1}vv_{x}+\lambda_{2}v_{x}v_{xx}

on the MLP surrogate \hat{v}(t,x)=v_{\theta}(t,x;\xi), where \theta=\theta(W,b) denotes all parameters of the network (weights W and biases b) and \xi=(\lambda_{1},\lambda_{2}) the physical parameters in (4.1), acting directly in the loss function

\mathcal{L}(\theta,\xi)=\mathcal{L}_{d}(\theta,\xi)+\mathcal{L}_{r}(\theta,\xi), \qquad (4.4)

where \mathcal{L}_{d} is the loss on the available measurement data set, consisting of the mean-squared error (MSE) between the MLP’s predictions and the training data, and \mathcal{L}_{r} is the additional residual term quantifying the discrepancy of the neural network surrogate v_{\theta} with respect to the underlying differential operator in (4.1). Note that \mathcal{L}_{d}=w_{0}\mathcal{L}_{0}+w_{b}\mathcal{L}_{b} with w_{0}=w_{b}=1 in Figure 1. We define the data residual at (t_{i},x_{i},v_{i}) in D:

\mathcal{R}_{d,\theta}(t_{i},x_{i};\xi):=v_{\theta}(t_{i},x_{i};\xi)-v_{i},

and the PDE residual at (t_{i},x_{i}) in D:

\mathcal{R}_{r,\theta}(t_{i},x_{i};\xi):=\partial_{t}v_{\theta}+\lambda_{1}v_{\theta}\partial_{x}v_{\theta}+\lambda_{2}\partial_{x}v_{\theta}\partial_{xx}v_{\theta},

where v_{\theta}=v_{\theta}(t,x;\xi). Then, the data loss and residual loss functions in (4.4) can be written as

\mathcal{L}_{d}(\theta,\xi)=\frac{1}{N_{d}}\sum_{i=1}^{N_{d}}\Big{|}\mathcal{R}_{d,\theta}(t_{i},x_{i};\xi)\Big{|}^{2},
\mathcal{L}_{r}(\theta,\xi)=\frac{1}{N_{r}}\sum_{i=1}^{N_{r}}\Big{|}\mathcal{R}_{r,\theta}(t_{i},x_{i};\xi)\Big{|}^{2}.

The goal is to find the network and physical parameters \theta and \xi that minimize the loss function (4.4):

(\theta^{*},\xi^{*})=\underset{\theta\in\Theta,\,\xi\in\Xi}{\arg\min}\,\mathcal{L}(\theta,\xi)

over admissible sets \Theta and \Xi of the network parameters \theta and the physical parameters \xi, respectively.

In practice, given the set of scattered data v_{i}=v(t_{i},x_{i}), the MLP takes the coordinates (t_{i},x_{i}) as input and produces output vectors v_{\theta}(t_{i},x_{i};\xi) that have the same dimension as v_{i}. The PDE residual \mathcal{R}_{r,\theta}(t,x;\xi) forces the output vector v_{\theta} to comply with the physics imposed by the LB equation. The PDE residual network takes its derivatives with respect to the input variables t and x by applying the chain rule to differentiate the compositions of functions using the automatic differentiation integrated into TensorFlow. The residual of the underlying differential equation is evaluated using these gradients. The data loss and the residual loss are trained using input from across the entire domain of interest.
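For concreteness, a sketch of how the trainable physical parameters and the two loss terms might be set up (illustrative only; variable names and initial guesses are ours, and the derivatives are assumed to come from the surrogate network and automatic differentiation as sketched in Section 3):

```python
import tensorflow as tf

# Trainable physical parameters xi = (lambda1, lambda2); alpha = sqrt(lambda2).
lambda1 = tf.Variable(1.0, dtype=tf.float32)
lambda2 = tf.Variable(1e-3, dtype=tf.float32)

# Loss (4.4) = data MSE + PDE-residual MSE at the training points.
def inverse_loss(v_pred, v_data, v_t, v_x, v_xx):
    data_mse = tf.reduce_mean(tf.square(v_pred - v_data))
    residual = v_t + lambda1 * v_pred * v_x + lambda2 * v_x * v_xx
    return data_mse + tf.reduce_mean(tf.square(residual))
```

Minimizing this loss jointly over the network weights and (λ_1, λ_2) should drive λ_1 toward 1 whenever the recovered α = √λ_2 is physically consistent with the data.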

4.2 Experiment 1: Inviscid with Riemann Initial Data

We consider the inviscid Burgers equation (2.3) with some standard Riemann initial data of the form

v_{0}(x)=\begin{cases}v_{L},&x\leq 0\\ v_{M},&0<x\leq 1\\ v_{R},&x>1\end{cases}

We used a conservative upwind difference scheme to generate training data. For each initial profile, we computed 256 × 101 = 25856 data points throughout the entire spatiotemporal domain. We modified the code in [14] and, for each case, performed ten computational simulations with 2000 training data randomly sampled for each computation. We adopted the Limited-Memory BFGS (L-BFGS) optimizer with a learning rate of 0.01 to minimize the MSE (4.4). When the L-BFGS optimizer diverged, we preprocessed with the ADAM optimizer and finalized the optimization with L-BFGS. We manually trained the algorithm with random sets of hyperparameters and selected the set that best fit our objective: 8 hidden layers, 20 units per layer, and 10000 epochs. We trained the other models with the same parameters, which might not be optimal for them but are a reasonable fit. One remark is that our problem is identifying the model parameter α rather than inferring solutions, so it is unnecessary to consider physical causality in our loss function (4.4), as pointed out in [19].
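For reference, a minimal version of such a scheme (ours, assuming nonnegative Riemann states so the upwind direction is always to the left; grid sizes and CFL number are illustrative, not the exact settings used to produce the data):

```python
import numpy as np

# Conservative upwind scheme for v_t + (v^2/2)_x = 0 with nonnegative states.
def upwind_burgers(v0, dx, dt, n_steps):
    v = v0.copy()
    for _ in range(n_steps):
        f = 0.5 * v**2                                   # flux F(v) = v^2/2
        v[1:] = v[1:] - (dt / dx) * (f[1:] - f[:-1])     # upwind (left) differences
    return v

x = np.linspace(-1.0, 2.0, 301)
dx = x[1] - x[0]
v0 = np.where(x <= 0.0, 1.0, 0.0)                        # Riemann data, profile (II)
dt = 0.4 * dx                                            # CFL: dt <= dx / max|v|
v1 = upwind_burgers(v0, dx, dt, n_steps=int(1.0 / dt))   # jump near x = 0.5 at t ~ 1
```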

Upon training, the network is calibrated to predict the entire solution v(t,x), as well as the unknown parameters \theta and \xi. Along with the relative L^2-norm of the difference between the exact solution and the corresponding trial solution,

E_{r}(=E_{r}(\hat{v})):=\frac{||v-\hat{v}||_{2}}{||v||_{2}},

we used the absolute error of \lambda_{1},

\epsilon(\lambda_{1})=|1-\lambda_{1}|

in determining the validity of each computational result.

The practical range of α was determined by ensuring that the relative L^2 error E_r remained below 10^{-2} while ε(λ_1) < 0.01, aligning the LB solution with the inviscid Burgers entropy solution. The results show that the α value depends on the initial data, with the effective range of α being between 0.01 and 0.05 for continuous initial profiles and between 0.01 and 0.03 for discontinuous initial profiles.

When appropriate, we will also measure the averaged relative L^2 error in time,

\bar{E}_{r}=\frac{1}{T}\int_{0}^{T}\frac{||v-\hat{v}||_{2}}{||v||_{2}}\,dt.
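These error measures translate directly into a few lines of code; in the sketch below the time integral is approximated by an average over the stored snapshots (an assumption of uniform time sampling).

```python
import numpy as np

# v_exact, v_pred: arrays of shape (n_times, n_space) on the same grid.
def relative_l2(v_exact, v_pred):
    return np.linalg.norm(v_exact - v_pred, axis=1) / np.linalg.norm(v_exact, axis=1)

def averaged_relative_l2(v_exact, v_pred):
    return np.mean(relative_l2(v_exact, v_pred))   # approximates (1/T) * integral of E_r dt

def eps_lambda1(lambda1):
    return abs(1.0 - lambda1)                      # validity check for the inferred PDE
```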

4.2.1 Shock Waves

We consider two different initial profiles that develop shocks:

v_{t}+vv_{x}=0, \quad x\in\mathbb{R},\ t\in[0,1)

with the initial data,

\mathrm{(I)}\quad v(0,x)=\begin{cases}1&\text{if }x\leq 0\\ 1-x&\text{if }0<x<1\\ 0&\text{if }x\geq 1\end{cases} \qquad (4.5)
\mathrm{or}\quad \mathrm{(II)}\quad v(0,x)=\begin{cases}1&\text{if }x\leq 0\\ 0&\text{if }x>0.\end{cases}

The exact entropy solutions corresponding to the initial data (I) and (II) in (4.5) are

\mathrm{(I^{\prime})}\quad v(t,x)=\begin{cases}1&\text{if }x\leq t\\ \frac{1-x}{1-t}&\text{if }t<x<1\\ 0&\text{if }x\geq 1\end{cases} \qquad (4.6)
\mathrm{and}\quad \mathrm{(II^{\prime})}\quad v(t,x)=\begin{cases}1&\text{if }x\leq\frac{t}{2}\\ 0&\text{if }x>\frac{t}{2},\end{cases}

respectively. The initial profile (I) in (4.5) represents a ramp function with a slope of −1, which creates a wave that travels faster on the left-hand side of x than on the right-hand side. The faster wave overtakes the slow wave, causing a discontinuity when t=1, as we can see from the exact solution (I′) in (4.6). The second initial data (II) in (4.5) contains a discontinuity at x=0. Its solution needs a shock fitting from the very beginning.

No. Initial Profile (I) Initial Profile (II)
λ_2 = α^2  ε(λ_1)  E_r  λ_2 = α^2  ε(λ_1)  E_r
1 1.11e-3 3.4e-3 5.73e-3 6.76e-4 6.18e-2 7.16e-3
2 1.28e-3 6.6e-3 5.91e-3 4.46e-4 2.32e-2 3.70e-3
3 1.54e-3 1.24e-2 5.17e-3 4.85e-4 1.70e-2 5.31e-3
4 1.31e-3 1.33e-2 5.62e-3 5.69e-4 7.2e-3 5.92e-3
5 1.47e-3 6.2e-3 5.45e-3 6.53e-4 6.5e-3 7.77e-3
6 1.32e-3 6.1e-3 5.09e-3 8.41e-4 2.04e-2 1.04e-2
7 6.02e-4 1.03e-2 7.02e-3 8.76e-4 5.91e-2 1.09e-2
8 1.94e-3 6.7e-3 5.29e-3 8.77e-4 4.5e-3 1.23e-2
9 7.32e-4 5.6e-3 6.36e-3 9.17e-4 1.51e-2 1.76e-2
10 1.81e-4 1.24e-2 5.33e-3 7.64e-4 4.00e-4 9.83e-3
Avg 1.31e-3 8.30e-3 5.70e-3 7.11e-4 2.12e-2 9.09e-3
√Avg 3.62e-2 2.67e-2
Table 1: Ten simulation results with N_d = 2000 training data randomly sampled for each computation.

Based on the Rankine-Hugoniot condition, the discontinuity must travel at speed x'(t) = 1/2, which we can observe in the analytical solution (II′) in (4.6). The solution also satisfies the entropy condition, which guarantees that it is the unique weak solution for the problem. Table 1 shows ten computational results.

In both cases, the average ε(λ_1) is within 3×10^{-2}, indicating that the inferred PDE residual reflects the actual Leray-Burgers solutions within an acceptable range. The average value of α with the initial profile (I) was 0.0362 with E_r = 5.7×10^{-3}. Figure 2 shows a plot example. We can see that the Leray-Burgers solution captures the shock wave well and maintains the discontinuity at x=1 as t evolves to 1. Computations with the initial profile (II) resulted in α ≈ 0.0267 on average with E_r = 9.1×10^{-3}. Figure 2 shows that the Leray-Burgers equation captures the shock wave as well as its speed of 1/2 per unit time. Increasing the training data (N_d ≥ 4000) did not change the value of α significantly.

Figure 2: Top: Evolution with the initial profile (I). Bottom: Evolution with the initial profile (II).

4.2.2 Rarefaction Waves

We generate a training data set from the inviscid Burgers equation

v_{t}+vv_{x}=0, \quad x\in\mathbb{R},\ t\in[0,2]

with the initial data

\mathrm{(III)}\quad v(0,x)=\begin{cases}0&\text{if }x\leq 0\\ x&\text{if }0<x<1\\ 1&\text{if }x\geq 1\end{cases} \qquad (4.7)
\mathrm{or}\quad \mathrm{(IV)}\quad v(0,x)=\begin{cases}0&\text{if }x\leq 0\\ 1&\text{if }x>0.\end{cases}

The rarefaction waves are continuous self-similar solutions, which are

\mathrm{(III^{\prime})}\quad v(t,x)=\begin{cases}0&\text{if }x\leq 0\\ \frac{x}{1+t}&\text{if }0<x<1+t\\ 1&\text{if }x\geq 1+t\end{cases}
\mathrm{and}\quad \mathrm{(IV^{\prime})}\quad v(t,x)=\begin{cases}0&\text{if }x\leq 0\\ \frac{x}{t}&\text{if }0<x<t\\ 1&\text{if }x\geq t\end{cases}

corresponding to the initial data (III) and (IV) in (4.7), respectively.

In both cases, ε(λ_1) is within 10^{-2}, indicating that the inferred PDE residual reflects the Leray-Burgers equation within an acceptable range. The average values of α are 0.0488 with E_r = 1.99×10^{-3} for the continuous initial profile (III) and α ≈ 0.0276 with E_r = 7.5×10^{-3} for the discontinuous initial profile (IV). Figure 3 shows that the LB equation captures the rarefaction waves well.

Figure 3: Top: Evolution with the initial profile (III). Bottom: Evolution with the initial profile (IV).

4.2.3 Shock and Rarefaction Waves

We combine the shock and rarefaction profiles:

\mathrm{(V)}\quad v(0,x)=\begin{cases}0&\text{if }x\leq-1\\ 1+x&\text{if }-1<x\leq 0\\ 1-x&\text{if }0<x\leq 1\\ 0&\text{if }x>1\end{cases}
\mathrm{or}\quad \mathrm{(VI)}\quad v(0,x)=\begin{cases}0&\text{if }x<0\\ 1&\text{if }0\leq x\leq 1\\ 0&\text{if }x>1\end{cases}

In both cases, ε(λ_1) is within 10^{-2}, and the mean values of α are 0.0348 with E_r = 1.95×10^{-2} for the continuous initial profile (V) and α ≈ 0.0316 with E_r ≈ 3.17×10^{-2} for the discontinuous initial profile (VI). Figure 4 shows that the LB equation captures both shock and rarefaction waves well.

Figure 4: Top: Evolution with the initial profile (V). Bottom: Evolution with the initial profile (VI).

4.3 Experiment 2: Viscid Cases

In this section, we consider the following viscous Burgers equation for a training data set:

v_{t}+vv_{x}=\nu v_{xx}, \quad \forall(t,x)\in(0,T]\times\Omega \qquad (4.8)

with

\mathrm{(A)}\quad \begin{cases}\nu=\frac{0.01}{\pi},\ T=1,\ \Omega=[-1,1]\\ v(0,x)=-\sin(\pi x),\ \forall x\in\Omega\\ v(t,-1)=v(t,1)=0,\ \forall t\in[0,1]\end{cases}
\mathrm{or}\quad \mathrm{(B)}\quad \begin{cases}\nu=0.07,\ T\approx 0.4327,\ \Omega=[0,2\pi]\\ v(0,x)=-2\nu\frac{\phi^{\prime}(x)}{\phi(x)}+4,\ \forall x\in\Omega\\ \phi(x)=\exp\left(\frac{-x^{2}}{4\nu}\right)+\exp\left(\frac{-(x-2\pi)^{2}}{4\nu}\right).\end{cases}

The corresponding LB equation is

v_{t}+vv_{x}=-\alpha^{2}v_{x}v_{xx}.

For the initial and boundary data (A), Rudy et al. [16] proposed a data set from which the viscous Burgers equation can be correctly identified solely from time series data. It contains 101 time snapshots of a solution to the Burgers equation with a Gaussian initial condition propagating into a traveling wave. Each snapshot has 256 uniform spatial grid points. For our experiment, we adopt the data set prepared by Raissi et al. in [15, 14] based on [16], 101 × 256 = 25856 data points, generated from the exact solution to (4.8). For training, N_d = 2000 collocation points are randomly sampled and we use the L-BFGS optimizer with a learning rate of 0.8. The average of ten experiments is α = 0.0158 with E_r ≈ 3.8×10^{-2}. The computational simulation shows that the equation develops a shock properly (Figure 5). Note that ν = 0.01/π ≈ 12.7α^2.

Figure 5: Example with Nd=2000N_{d}=2000 for the initial profile (A)

For the initial and periodic boundary condition (B), we generate 256 × 500 = 128000 training data from the exact solution formula for the whole dynamics over time. With N_d = 2000 training data, the PINN diverges frequently. We therefore experiment with 4000 or more data points to determine an appropriate number of training data. The L^2 error remains around 10^{-2} in all cases, which does not provide a clear cutoff, so we use the absolute error of λ_1 to determine the appropriate number of training data. For each value of N_d, we perform the computation 5 to 10 times (Table 2).

N_d 4000 6000 8000 10000 12000 14000
ε(λ_1) 0.0207 0.0113 0.01059 0.00965 0.00908 0.00848
N_d 16000 18000 20000 25000 30000
ε(λ_1) 0.0046 0.0059 0.00604 0.00671 0.00611
Table 2: ε(λ_1) = |1−λ_1| for various N_d with the initial profile (B).

As N_d increases, the results improve until they reach an upper limit. Errors for N_d between 14000 and 18000 are better than in other ranges, and using more than 18000 points does not seem to improve the results further. We therefore choose N_d = 16000 (12.5% of the total data); larger values give nearly identical results. The average of ten computations is α ≈ 0.0894 with E_r = 1.64×10^{-2}. Observe that ν = 0.07 ≈ 8.8α^2.

Figure 6: Example with Nd=16000N_{d}=16000 for the initial profile (B).

Every part of the solution for (B) moves to the right at the same speed, which differs from (A) (Fig. 6). In (A), the left side of a peak moves faster than the right side, developing a steeper middle. This results in a higher value of α for (B) than for (A).

In summary, we observe that ν = 0.01/π ≈ 12.7α^2 with the profile (A) and ν = 0.07 ≈ 8.8α^2 with the profile (B). These results demonstrate that the LB equation can capture nonlinear interactions at significantly smaller length scales compared to the viscous Burgers equation. Notably, numerical schemes for the viscous Burgers equation become unstable at lower ν values, whereas the LB equation maintains stability and convergence under these conditions. This observation will become clearer when we compare the forward inferred solutions of the two equations in Section 5 (Part C).

4.4 Experiment 3: The Filtered Vector u

We write Equation (2.1) in the filtered vector u_{\alpha} = u, which gives a quasilinear evolution equation consisting of the inviscid Burgers equation plus \mathcal{O}(\alpha^{2}) nonlinear terms [2, 3, 4]:

u_{t}+uu_{x}=\alpha^{2}(u_{txx}+uu_{xxx}). \qquad (4.9)

We compute the equation with the same conditions as in the previous corresponding experiments. The results show that the filtered equation (4.9) also tends to depend on the continuity of the initial profile as shown in Table 3.

IC Continuous IC Discontinuous
I 0.0279 II 0.0004
III 0.0469 IV 0.0127
V 0.0469 VI 0.0277
Table 3: Averaged α values for the filtered vector u, with N_d = 2000 and 10000 epochs except for case II. (IC = Initial Condition as in Section 4.)

When initial profiles contain discontinuities, the α values are much smaller than those with continuous initial profiles. Compared to the unfiltered equation (2.1), the α values for the filtered equation (4.9) are smaller, which may cause more oscillation in forward inference.

With the initial profile (II), the parameter λ_1 for the filtered velocity is not close to 1, with ε(λ_1) ≈ 0.1076 on average. By increasing the number of epochs from 10000 to 50000 we get a better result: λ_1 gets closer to 1 with ε(λ_1) ≈ 0.0531, and the relative error and loss improve slightly, which makes the solution better at later times. The oscillation near the discontinuity is reduced. This verifies that u needs very small α values to approximate the inviscid Burgers solution.

Figure 7: Examples with N_d = 2000 for the initial profile (II). Top: epochs = 10000. Bottom: epochs = 50000.

Having established α\alpha’s practical range, we next explore its application in forward inference.

5 Data-Driven Solutions of the Leray-Burgers Equation

In this section, we solve the LB equation across multiple initial and boundary condition scenarios:

v_{t}+vv_{x}+\alpha^{2}v_{x}v_{xx}=0, \quad x\in\mathbb{R},\ t\in(0,1)

with

\mathrm{(I)}\quad v(0,x)=\begin{cases}1&\text{if }x\leq 0\\ 1-x&\text{if }0<x<1\\ 0&\text{if }x\geq 1\end{cases}
\mathrm{or}\quad \mathrm{(II)}\quad v(0,x)=\begin{cases}1&\text{if }x\leq 0\\ 0&\text{if }x>0.\end{cases}

Training utilizes N_0 = 5000 initial condition points, N_b = 5000 boundary condition points, and N_r = 20000 collocation points. These points are adaptively sampled using Latin Hypercube Sampling, with an emphasis on regions exhibiting high PDE residuals or steep solution gradients, particularly in shock regions near discontinuities identified in the initial condition. Our computational focuses are as follows:

  1. Convergence in α: whether the PINN solutions converge to those of the inviscid Burgers equation as α → 0^+.

  2. Forward inference with adaptive α(t): whether the PINN solutions capture the shock and rarefaction waves well and whether the trained α values are within the physically valid range.

  3. Scaling effect of the α parameter relative to the inviscid and viscous Burgers equations.

5.1 The Convergence of the Leray-Burgers Solutions as α → 0^+

Figure 8 demonstrates that the Leray-Burgers equation effectively captures the shock formation with the continuous initial profile (I) within the range 0 < α < 0.05. As α → 0^+, the LB solution converges to the inviscid Burgers solution (the last graph in Figure 8).

Figure 8: PINN solution with the initial profile (I) with N_0 = 1000, N_b = 1000, and N_r = 10000 with epochs = 20000. α = 0.05, 0.04, 0.03, 0.01, 0.0 from the top, with α = 0 (inviscid Burgers). The corresponding errors are \bar{E}_r = 1.1×10^{-2}, 7.9×10^{-3}, 5.9×10^{-3}, 2.6×10^{-3}, and 3.2×10^{-3} for α = 0, respectively.
Figure 9: PINN solution with the initial profile (II) with N_0 = 1000, N_b = 1000, and N_r = 10000 with epochs = 20000. α = 0.03, 0.025, 0.015, 0.01, 0.0 from the top, with α = 0 (inviscid Burgers). The corresponding errors are \bar{E}_r = 4.7×10^{-2}, 4.4×10^{-2}, 3.4×10^{-2}, 9.4×10^{-2}, and 10.0×10^{-2} for α = 0, respectively.

With the discontinuous initial profile (II), the Leray-Burgers equation still accurately captures the shock formation within the range 0.01 < α < 0.03 (Figure 9). However, the MLP-based PINN generates spurious oscillations near the discontinuity at the beginning. Although the network quickly recovers and fits the oscillations as time progresses, the oscillations worsen and nonlinear instability arises as the α scale becomes smaller than 0.01, which leads to the deviation of the network solution from the actual inviscid Burgers solution (the last graph in Figure 9).

5.2 Forward Inference with Adaptively Optimized α > 0

In this section, we employ the MLP-based Physics-Informed Neural Network (MLP-PINN) to effectively learn the nonlinear operator \mathcal{N}_{\alpha}[v], wherein v represents the primary variable and α denotes a parameter. Coutinho et al. [5] introduced the idea of an adaptive artificial viscosity that can be learned during the training procedure and does not depend on an a priori choice of artificial viscosity coefficient. Instead of incorporating the parameter α in place of the artificial viscosity as in [5], we set up a dedicated subnetwork, Alpha2Net, depicted in Figure 1, to find the optimal α(t) value. The integration of the subnetwork into the main PINN architecture lets the PINN train both v and α to achieve a robust fit with the LB equation. Two examples highlight the ability of the LB equation to capture shock and rarefaction waves as well as the corresponding optimal values of α, which are presented in Figure 10.

Figure 10: Optimal values: (II) α = 0.0169 with \bar{E}_r = 3.5×10^{-2} (top) and (IV) α = 0.0032 with \bar{E}_r = 7.9×10^{-3} (bottom).

For the computations, we generated 100 × 1000 = 100000 training data in the domain [0,2]×[−2,4] from the corresponding analytical solution for each case, with N_0 = N_b = 1000, N_r = 10000, and epochs = 20000. The first graph presents computational snapshots of the system’s evolution with the initial profile (II) over the time interval [0, 2]. The computational outputs are \mathcal{L} ≈ 4.1×10^{-4}, and the averaged relative L^2 error in time is around 3.5×10^{-2} with α ≈ 0.0169. Note that α is the average of the trained values of α(t) over time. The second graph illustrates snapshots of the evolution of a rarefaction wave with the initial profile (IV). The computational outputs are \mathcal{L} ≈ 1.9×10^{-4}, and the averaged relative L^2 error in time is around 7.9×10^{-3} with a time-averaged α = 0.0032.

5.3 The Effect of the α Scale in Relation to the Inviscid and Viscous Burgers Equations

When comparing the Leray-Burgers equation (2.1) with the viscous Burgers equation (4.8), the term α^2 v_x v_xx in (2.1) serves as a nonlinear regularization mechanism, acting as a substitute for the linear diffusion term in the viscous Burgers equation. Unlike linear diffusion, the α term in Equation (2.1) depends on both the first derivative v_x and the second derivative v_xx, suggesting that its smoothing effect is more pronounced in regions with high gradients, modulated by the parameter α. Thus, it is valuable to assess the performance of these two equations in relation to the inviscid Burgers equation.

Both equations are solved using PINNs with consistent training configurations: 20,000 epochs, fixed weights, and identical network architectures (8 hidden layers, 20 neurons per layer). The key metric for comparison is the L^2 error, which quantifies the difference between the predicted and exact solutions, with lower values indicating better accuracy. Computations provide L^2 errors for both equations across different values of α, with ν set equal to α^2 in the viscous Burgers equation (4.8). The averaged L^2 errors over time are summarized in Table 4.

α  ν = α^2  LB Average L^2  VB Average L^2
0.025  0.000625  2.8025×10^{-2}  9.1788×10^{-2}
0.030  0.0009  2.9933×10^{-2}  1.2258×10^{-1}
0.032  0.001024  3.2280×10^{-2}  7.8998×10^{-1}
0.033  0.001089  3.1334×10^{-2}  8.6701×10^{-2}
0.035  0.001225  3.2484×10^{-2}  1.3176×10^{-2}
Table 4: Averaged L^2 errors over time for the Leray-Burgers (LB) and viscous Burgers (VB) equations.

For α values ranging from 0.025 to 0.033 (ν from 0.000625 to 0.001089), the LB equation consistently outperforms the viscous Burgers equation in the averaged L^2 error. The averaged L^2 error for the LB equation remains relatively stable, ranging from 2.8025×10^{-2} to 3.2280×10^{-2}. In contrast, the Burgers equation exhibits higher errors, ranging from 7.8998×10^{-2} to 1.2258×10^{-1}. There is no clear monotonic trend, indicating variability in the neural network’s ability to approximate the solution.

These results indicate that for small values of ν\nu, the viscous Burgers equation is prone to developing shocks due to its hyperbolic nature. PINNs may struggle to accurately capture these discontinuities. In contrast, the LB equation, through its nonlinear regularization effect (dependent on α\alpha), likely smooths these discontinuities, leading to improved accuracy.

The data also suggest a tipping point around α = 0.032 (ν = 0.001024), where the performance of the two models begins to shift. A more definitive transition appears to occur between α = 0.033 and α = 0.035 (ν = 0.001225), at which point the viscous Burgers equation begins to outperform the LB equation. This transition is illustrated in Figure 11.

Figure 11: Top two rows: Leray-Burgers equation with α = 0.032 and 0.035, with \bar{E}_r = 5.1×10^{-2} and 4.8×10^{-2}, respectively. Bottom two rows: viscous Burgers equation with ν = 0.001024 and 0.001225, with \bar{E}_r = 1.3×10^{-1} and 3.9×10^{-2}, respectively.

To further examine the differences in solution behavior, Figure 12 presents heatmaps of the difference between the solutions of the LB equation and the inviscid Burgers equation, as well as the difference between the viscous Burgers equation and the inviscid Burgers equation, for α = 0.025 (ν = 0.000625). The LB equation exhibits a more gradual transition in error distribution, while the viscous Burgers equation shows sharper localized discrepancies along the shock region. This suggests that the nonlinear regularization in the LB equation helps mitigate sharp discontinuities, leading to improved prediction accuracy.

Figure 12: Heatmaps of the difference between exact and predicted solutions for α = 0.025. Top: Leray-Burgers vs. inviscid Burgers. Bottom: Viscous Burgers vs. inviscid Burgers.

In summary, the parameter α (through ν = α^2) controls the regularization strength. Smaller values of α correspond to finer scales where regularization enhances accuracy, while larger values increase ν, potentially leading to over-smoothing compared to the standard viscous Burgers equation. In practice, the LB equation may be preferable for smaller length scales (low α), while the viscous Burgers equation may be more suitable for larger scales (higher α). The transition appears to occur near α = 0.035. These findings emphasize the interplay between physical regularization, viscosity, and the numerical approximation capabilities of PINNs.

6 Application to Traffic State Estimation

This section demonstrates the practical utility of forward inference with the LB equation, using the estimated α range to model traffic dynamics efficiently. Huang et al. [9] applied PINNs to tackle the challenge of data sparsity and sensor noise in traffic state estimation (TSE). The main goal of TSE is to obtain and provide a reliable description of traffic conditions in real time. In Case Study-I in [9], they prepared a test bed of a 5000-meter road segment for 300 seconds ((t,x) ∈ [0,300]×[0,5000]). The spatial resolution of the dataset is 5 meters and the temporal resolution is 1 second. The case study was designed to utilize the trajectory information from Connected and Autonomous Vehicles (CAVs) as captured by Roadside Units (RSUs), which were deployed every 1000 meters on the road segment (6 RSUs on the 5000-meter road from x = 0). The communication range of an RSU was assumed to be 300 meters, meaning that vehicle information broadcast by CAVs at x ∈ [0,300] can be captured by the first RSU, the second RSU can log CAV data transmitted at x ∈ [700,1300], etc. More details on data acquisition and description can be found in [9, 10].

In this section, we switch to the differential notation \frac{d\rho}{dt},\frac{\partial\rho}{\partial x},\frac{\partial^{2}\rho}{\partial x^{2}} to avoid confusion with constant parameter notations such as ρ_m. Let q(t,x) denote the flow rate, indicating the number of vehicles that pass a set location in a unit of time, and ρ(t,x) the flow density, representing the number of vehicles in a unit road of space. Then, the Lighthill-Whitham-Richards (LWR) traffic model [9] is, for (t,x) ∈ \mathbb{R}^{+}×\mathbb{R},

\frac{\partial\rho(t,x)}{\partial t}+\frac{\partial q(t,x)}{\partial x}=0, \qquad (6.1)

where \rho(t,x)=-\frac{\partial N(t,x)}{\partial x} and q(t,x)=\frac{\partial N(t,x)}{\partial t}. Here N(t,x) is the cumulative flow, which depicts the number of vehicles that have passed location x by time t. Huang et al. [9] adopted the Greenshields fundamental diagram to set the relationship between the traffic states density ρ, flow q, and speed v:

q(\rho)=\rho v_{f}\left(1-\frac{\rho}{\rho_{m}}\right) \qquad (6.2)
v(\rho)=v_{f}\left(1-\frac{\rho}{\rho_{m}}\right),

where ρ_m is the jam density (maximum density) and v_f is the free-flow speed. Substituting the relationship (6.2) into (6.1) transforms the LWR model into the LWR-Greenshield model

v_{f}\left(1-\frac{2\rho(t,x)}{\rho_{m}}\right)\frac{\partial\rho(t,x)}{\partial x}+\frac{\partial\rho(t,x)}{\partial t}=0. \qquad (6.3)

We will simply call it the LWR model. Equation (6.3) is a hyperbolic PDE, and a second-order diffusive term can be added as follows to make the PDE parabolic and secure a strong solution:

v_{f}\left(1-\frac{2\rho(t,x)}{\rho_{m}}\right)\frac{\partial\rho(t,x)}{\partial x}+\frac{\partial\rho(t,x)}{\partial t}=\epsilon\frac{\partial^{2}\rho}{\partial x^{2}}. \qquad (6.4)

We will call the equation (6.4) the LWR-ϵ\epsilon model. The second-order diffusion term ensures that the solution of PDE is continuous and differentiable, avoiding breakdown and discontinuity in the solution. Following the same structural idea from (6.3) to (6.4) we add a regularization term to (6.3) instead of the diffusion term in (6.4):

v_{f}\left(1-\frac{2\rho(t,x)}{\rho_{m}}\right)\frac{\partial\rho(t,x)}{\partial x}+\frac{\partial\rho(t,x)}{\partial t}=-\alpha^{2}\frac{\partial\rho}{\partial x}\frac{\partial^{2}\rho}{\partial x^{2}}. \qquad (6.5)

We call Equation (6.5) the LWR-$\alpha$ model. We set up the same PINN architecture for computational comparisons of the three models, LWR, LWR-$\epsilon$, and LWR-$\alpha$, with $v_{f}=25\,\mathrm{m/s}$ and $\rho_{m}=0.15\,\mathrm{vehicles/m}$. Figure 13 visualizes the computational results.
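For reference, the following PyTorch-style sketch shows how the residuals of (6.3)-(6.5) could be encoded in the physics term of a PINN loss. It is a minimal illustration under our own assumptions: rho_net is any network mapping $(t,x)$ to $\rho$, the grad helper wraps torch.autograd.grad, and the inputs are column tensors of collocation points; this is not the exact architecture behind Figure 13.

import torch

V_F, RHO_M = 25.0, 0.15   # free-flow speed (m/s) and jam density (vehicles/m)

def grad(y, x):
    # derivative of y with respect to x, keeping the graph for higher derivatives
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

def residuals(rho_net, t, x, eps=0.025, alpha=0.025):
    # t, x: (N, 1) leaf tensors of collocation points
    t = t.requires_grad_(True)
    x = x.requires_grad_(True)
    rho = rho_net(torch.cat([t, x], dim=1))
    rho_t, rho_x = grad(rho, t), grad(rho, x)
    rho_xx = grad(rho_x, x)

    advection = V_F * (1.0 - 2.0 * rho / RHO_M) * rho_x + rho_t
    r_lwr = advection                                   # LWR, Eq. (6.3)
    r_eps = advection - eps * rho_xx                    # LWR-epsilon, Eq. (6.4)
    r_alpha = advection + alpha**2 * rho_x * rho_xx     # LWR-alpha, Eq. (6.5)
    return r_lwr, r_eps, r_alpha

The physics loss for each model would then be the mean squared residual over the collocation points, added to the data-misfit loss evaluated on the RSU observations.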

Figure 13: Traffic state estimation: reference speed $v$, LWR-$\alpha$ estimate ($\alpha=0.025$), LWR-$\epsilon$ estimate ($\epsilon=0.025$), and LWR estimate, counter-clockwise from the top left. Relative error $\approx 10^{-1}$; LWR-$\alpha$ training is slightly slower than LWR but much faster than LWR-$\epsilon$.

Our empirical calculations demonstrate that both the LWR-$\alpha$ and LWR-$\epsilon$ models provide reasonable approximations of the reference speed $v$, as illustrated in Figure 13. Both models exhibit accuracy comparable to the standard LWR model, validating their potential for traffic state estimation applications.

The experiments highlight the critical role of nonlinear characteristics in traffic data for accurate state estimation. The LWR-$\alpha$ model emerges as the more practical choice due to its superior ability to capture the inherent nonlinear behavior of traffic flow. While the LWR-$\epsilon$ model offers a reasonable approximation, its poor computational performance for small diffusion coefficients (below 0.025) restricts its utility in real-time applications.

In our traffic state estimation (TSE) application, we employed the LWR-$\alpha$ model with $\alpha=0.025$. This parameter value directly addresses our primary objectives of determining the practical range of $\alpha$ in the Leray-Burgers equation and evaluating the effectiveness of Physics-Informed Neural Networks (PINNs) in solving the forward inference problem. The chosen value $\alpha=0.025$ lies within the range estimated from the inverse problem: 0.01 to 0.05 for continuous initial profiles and 0.01 to 0.03 for discontinuous profiles.

The successful application of the LWR-$\alpha$ model in accurately capturing the dynamics of traffic flow validates the physical relevance and practicality of our estimated $\alpha$ range. Moreover, the robust performance of PINNs in precisely estimating traffic states using this model demonstrates their effectiveness in solving the forward inference problem for the Leray-Burgers equation. Consequently, our TSE results substantiate both the accuracy of our $\alpha$ estimation and the capabilities of PINNs, thereby reinforcing the core findings of our study and affirming their potential in real-world applications.

7 Discussion

The relationship between the inverse and forward problems is a cornerstone of our approach to solving the Leray-Burgers (LB) equation with Physics-Informed Neural Networks (PINNs). In the inverse problem, we determine the practical range of the characteristic wavelength parameter $\alpha$ that ensures that the LB equation closely approximates the inviscid Burgers solution. This range, derived from the training of PINNs on inviscid Burgers data, reflects the values of $\alpha$ that maintain the physical fidelity of LB solutions under a variety of initial conditions.

This estimation is not an isolated step, but directly informs the forward inference process. When training PINNs to solve the LB equation, we do not prescribe a fixed $\alpha$. Instead, $\alpha$ is treated as a trainable parameter, optimized concurrently with the standard PINN parameters (weights and biases) through a subnetwork called Alpha2Net. To ensure that the optimized $\alpha$ remains physically meaningful, Alpha2Net enforces a constraint: $\alpha$ must be within the range established by the inverse problem. This restriction serves a dual purpose: it prevents the network from converging to nonphysical or suboptimal values of $\alpha$, and it leverages the prior knowledge gained from the inverse problem to improve the accuracy and stability of the forward solutions. For example, if the inverse problem indicates that $\alpha$ should range between 0.01 and 0.05 for certain profiles, Alpha2Net ensures that the $\alpha$ learned during forward inference adheres to these bounds, as in the sketch below. This linkage guarantees that the solutions to the LB equation not only capture complex phenomena like shocks and rarefactions but also remain consistent with the physical constraints established earlier. Thus, the inverse problem provides an essential scaffold that supports and refines forward inference, creating a unified framework for parameter estimation and PDE solution.
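A minimal sketch of this bound-enforcing idea, assuming a small fully connected subnetwork (the layer sizes and the name BoundedAlphaNet are illustrative, not the exact Alpha2Net of Section 5), is the following.

import torch
import torch.nn as nn

class BoundedAlphaNet(nn.Module):
    # Maps t to alpha(t); a sigmoid output rescaled into [alpha_min, alpha_max]
    # keeps the learned alpha inside the range estimated by the inverse problem.
    def __init__(self, alpha_min=0.01, alpha_max=0.05, hidden=20):
        super().__init__()
        self.alpha_min, self.alpha_max = alpha_min, alpha_max
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, t):
        s = torch.sigmoid(self.net(t))          # s lies in (0, 1)
        return self.alpha_min + (self.alpha_max - self.alpha_min) * s

Because the sigmoid output lies strictly in (0, 1), the affine rescaling keeps $\alpha(t)$ inside the prescribed interval at every optimization step, so the bound is enforced architecturally rather than through an additional penalty term.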

8 Conclusion

Computational experiments show that the $\alpha$-values depend on the initial data. Specifically, the practical range of $\alpha$ spans 0.01 to 0.05 for continuous initial profiles and narrows to 0.01 to 0.03 for discontinuous profiles. We also note that the Leray-Burgers equation formulated in the filtered vector field $u$ does not produce reliable estimates of $\alpha$. To approximate the filtered solution $u$ with acceptable precision, the MLP-PINN requires a larger dataset, and the admissible range of $\alpha$ for $u$ is much narrower, roughly 0.0001 to 0.005. Moreover, the MLP-PINN trained on $u$ struggles to converge to the true Burgers solutions. Thus, the equation formulated in the unfiltered vector field $v$ offers a better approximation to the exact Burgers equation.

In practical terms, treating $\alpha$ as an unknown variable is a prudent strategy. By endowing $\alpha$ with learnable attributes alongside the network parameters, MLP-PINNs can be structured to recover $\alpha$ within a valid range during training, potentially improving accuracy. Nevertheless, the MLP-PINN does generate spurious oscillations near the discontinuities inherent in shock-inducing initial profiles. This phenomenon prevents the PINN solution from converging to an exact inviscid Burgers solution as $\alpha\rightarrow 0^{+}$.

This study also demonstrates the effectiveness of the LWR-$\alpha$ model as a viable alternative for traffic state estimation. Surpassing the diffusion-based LWR-$\epsilon$ model in computational efficiency, the LWR-$\alpha$ model aligns with the nonlinear nature of traffic data.

Statements and Declarations

Competing Interests: On behalf of all authors, the corresponding author states that there is no conflict of interest.

Data Availability: The code and data we used to train and evaluate our models are available at https://github.com/bkimo/PINN-LB. The data for traffic state estimation generated by Huang et al. [9, 10] is available at https://github.com/arjhuang/pise.

Acknowledgment

The second author was supported by the Research Grant of Kwangwoon University in 2022 and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1F1A1058696). The third author gratefully acknowledges the Advanced Technology and Artificial Intelligence Center at the American University of Ras Al Khaimah for providing high-performance GPU computing resources to support this research.

References

  • [1] Raul K. C. Araújo, Enrique Fernández-Cara, and Diego A. Souza. On the uniform controllability for a family of non-viscous and viscous Burgers-$\alpha$ systems. ESAIM: Control, Optimisation and Calculus of Variations, 27(78), 2021.
  • [2] H. S. Bhat and R. C. Fetecau. A Hamiltonian regularization of the Burgers equation. Journal of Nonlinear Science, 16:615–638, 2006.
  • [3] H. S. Bhat and R. C. Fetecau. Stability of fronts for a regularization of the Burgers equation. Quarterly of Applied Mathematics, 66:473–496, 2008.
  • [4] H. S. Bhat and R. C. Fetecau. The Riemann problem for the Leray-Burgers equation. Journal of Differential Equations, 246:3957–3979, 2009.
  • [5] Emilio Jose Rocha Coutinho, Marcelo Dall’Aqua, Levi McClenny, Ming Zhong, Ulisses Braga-Neto, and Eduardo Gildin. Physics-informed neural networks with adaptive localized artificial viscosity. Journal of Computational Physics, 489(112265), 2023.
  • [6] Georg A. Gottwald. Dispersive regularizations and numerical discretizations for the inviscid Burgers equation. Journal of Physics A: Mathematical and Theoretical, 40(49), 2007.
  • [7] Billel Guelmame, Stéphane Junca, Didier Clamond, and Robert Pego. Global weak solutions of a Hamiltonian regularised Burgers equation. Journal of Dynamics and Differential Equations, 2022.
  • [8] Darryl D. Holm, Chris Jeffery, Susan Kurien, Daniel Livescu, Mark A. Taylor, and Beth A. Wingate. The LANS-$\alpha$ model for computing turbulence: Origins, results, and open problems. Los Alamos Science, 19, 2005.
  • [9] Archie J. Huang and Shaurya Agarwal. Physics-informed deep learning for traffic state estimation: Illustrations with LWR and CTM models. IEEE Open Journal of Intelligent Transportation Systems, 3, 2022.
  • [10] Archie J. Huang and Shaurya Agarwal. On the limitations of physics-informed deep learning: Illustrations using first-order hyperbolic conservation law-based traffic flow model. IEEE Open Journal of Intelligent Transportation Systems, 4, 2023.
  • [11] Traian Iliescu, Honghu Liu, and Xuping Xie. Regularized reduced order models for a stochastic Burgers equation. International Journal of Numerical Analysis and Modeling, 15(4-5):594–607, 2018.
  • [12] J. Leray. Sur le mouvement d’un liquide visqueux emplissant l’espace. Acta Math., 63:193–248, 1934.
  • [13] Yekaterina S. Pavlova. Convergence of the Leray $\alpha$-regularization scheme for discontinuous entropy solutions of the inviscid Burgers equation. The UCI Undergraduate Research Journal, pages 27–42, 2006.
  • [14] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics informed deep learning (Part II): Data-driven discovery of nonlinear partial differential equations. 2017. Preprint at https://arxiv.org/abs/1711.10566.
  • [15] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  • [16] S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz. Data-driven discovery of partial differential equations. Science Advances, 3, 2017.
  • [17] Feriedoun Sabetghadam and Alireza Jafarpour. $\alpha$ regularization of the POD-Galerkin dynamical systems of the Kuramoto–Sivashinsky equation. Applied Mathematics and Computation, 218:6012–6025, 2012.
  • [18] John Villavert and Kamran Mohseni. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  • [19] Sifan Wang, Paris Perdikaris, and Shyam Sankaran. Respecting causality for training physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering, 421(116813), 2024.
  • [20] Ting Zhang and Chun Shen. Regularization of the shock wave solution to the Riemann problem for the relativistic Burgers equation. Abstract and Applied Analysis, 2014, 2014.
  • [21] Hongwu Zhao and Kamran Mohseni. A dynamic model for the Lagrangian-averaged Navier-Stokes-$\alpha$ equations. Physics of Fluids, 17(075106), 2005.