KDE Based Coarse–graining of Semicrystalline Systems with Correlated Three–body Intramolecular Interaction

Jianlan Ye [ Vipin Agrawal Yiyang Li [ Minghao Liu [ Jing Hu [ Jay Oswald joswald1@asu.edu [

Abstract

We present an extension to the iterative Boltzmann inversion method to generate coarse–grained models with three–body intramolecular potentials that can reproduce correlations in structural distribution functions. The coarse–grained structural distribution functions are computed using kernel density estimates to produce analytically differentiable distribution functions with controllable smoothening via the kernel bandwidth parameters. Bicubic interpolation is used to accurately interpolate the three–body potentials trained by the method. To demonstrate this new approach, a coarse–grained model of polyethylene is constructed in which each bead represents an ethylene monomer. The resulting model reproduces the radial density function as well as the joint probability distribution of bond–length and bond–angles sampled from target atomistic simulations with only a 10% increase in the computational cost compared to models with independent bond–length and bond–angle potentials. Analysis of the predicted crystallization kinetics of the model developed by the new approach reveals that the bandwidth parameters can be tuned to accelerate the modeling of polymer crystallization. Specifically, computing target RDF with larger bandwidth slows down the secondary crystallization and increasing the bandwidth in $\theta$ –direction of bond–length and bond–angle distribution reduces the primary crystallization rate.

\urlstyle

same Arizona State University] School for the Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, 85287, United States Arizona State University] School for the Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, 85287, United States Arizona State University] School for the Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, 85287, United States Arizona State University] School for the Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, 85287, United States Arizona State University] School for the Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, 85287, United States

1 Introduction

Coarse–grained (CG) molecular dynamics (MD) simulations are commonly used to study polymer physics for their ability in probing molecular processes at substantially longer time and length scales than all–atom models.^{1, 2, 3, 4} While including more atoms in a bead reduces the computational cost of the model, the reduced resolution weakens its ability to accurately represent the structural properties of the underlying polymer. For example, Salero et al.⁵ found that polyethylene (PE) models with flexible chains did not crystallize at all when the coarse–grained beads represent more than four methylene groups. They attributed the phenomenon to the exceedingly large equilibrium bond–length compared with the equilibrium pair distance.^{6, 7} At the lowest degree of coarse–graining, united–atom model (UAM) that lumps hydrogen atoms with their bonded carbon atoms along the chain backbone are more capable at reproducing structural features. The UAM of polyethylene, initially developed by Paul et al.⁸ and subsequently modified by Waheed et al.,^{9, 10} yields a hexagonal crystal structure.¹¹ The UAM developed by Zubova et al. reproduced the expected orthorhombic crystal structure by adjusting the displacements of beads from the chain axes and the van der Waals radii.^{12, 13} However, given that united atom models have a relatively small reduction in the degrees of freedom compared to all–atom models, the range of time–scales the models can access remains highly limited. Therefore, many MD studies of polyethylene crystallization^{9, 14, 15, 16, 17, 18, 19} that employed UAMs used systems with chain lengths no longer than 1000 methyl units because the drastically reduced crystallization rate with longer chains necessitates unfeasible simulation time, although real polyethylene chains are much longer. Thus, developing CG models with efficient computation and accurate representation of structure can strengthen the capability of MD simulations in the investigations that require large systems and long time scales.

It has been shown that the accuracy of CG models in representing water,²⁰ RNA,²¹ and proteins^{22, 23} increase when considering multi–body interactions. Larini et al.²⁰ introduced three–body non–bonded interactions into their single–site water CG model by adding the relative angular orientation of the triplets of water molecules into the energy formulation; the addition greatly improved the model’s ability to reproduce the radial distribution function (RDF) and angular distribution function (ADF). Ejtehadi et al.²⁴ and Krishna et al.²⁵ showed that three–body interactions play an important role in accurately representing the protein folding mechanism and the structure of HIV-1 capsid, respectively. Also, Schaposchnikov et al.²⁶ showed that the contribution of three–body effect is 20% to 40% of ligand–capped nanocrystals. Multibody interactions can also improve the representability of polymer structures. Fukunaga et al.²⁷ presented a CG model that groups three CH₂ units into a CG bead and showed that the bond–lengths and torsion angles distributions strongly correlated with the bond–angle distributions at room temperature.

CG model interactions, e.g. non–bonded interactions, can be described either by a functional form or interpolated from tabulated values. Although, the tabulated potentials are more flexible in describing complex structures, they are also more prone to sampling noise. The noise–induced energetic fluctuations causes magnified oscillations when computing the forces, leading to a reduced stable time step. When training CG models with a structure–based method such as iterative Boltzmann inversion (IBI),²⁸ the structural distribution functions, e.g. RDF, are commonly constructed using histograms, which has well–known disadvantages such as limited smoothness with a small sample size and sensitivity to the bin size. Even though smoothing methods such as B-spline²⁷ and cubic spline²⁰ can be used to smooth the potential surface parametrized using histograms, McCabe et al.²⁹ showed that with the kernel density estimation (KDE), we can readily obtain analytically differentiable and smooth structural distribution functions regardless of sample size.

In this paper, we develop an approach for constructing CG models with three–body interactions to reproduce correlated bond–length and bond–angle distribution functions. The resulting potentials are generated on a two–dimensional grid and evaluated using bicubic interpolation. To generate accurate and smooth energy derivatives, the KDE is used to calculate the structural distribution functions that enables analytical differentiation of energy. We demonstrate the new method by developing coarse–grained models for PE, a model semicrystalline polymer. The models are named as CG–BA model. We also study how the KDE bandwidths affect the activation energy of diffusion, computational performance, and the crystallization kinetics of the resulting CG models.

2 Methodology

2.1 CG model

Refer to caption — Figure 1: Schematic demonstrating how (a) underlying trans and gauche torsional conformations at the atomistic representation lead to (b) multiple bond–lengths and bond–angles at the coarse–grained representation, which (c) manifests as multi–peaked coarse–grained bond and angle distributions.

Similar to the CG models developed by Li et al., ³⁰ we use a coarse–grained mapping scheme in which each CG interaction site, hereafter referred to as a bead, represents an ethylene unit and each bead center is defined as the midpoint of the C–C bond. Beads are considered bonded if they represent adjacent monomers on a molecular chain. Due to the relatively fine scale of the mapping scheme and the predominant trans and gauche torsion conformations in PE, the coarse–grained bond–length distribution (BDF) is double–peaked, as shown in Figure 1, leading to correlations between the BDF and ADF of the CG model.

The CG models were developed using the IBI method to produce tabulated CG potentials. The total potential energy of the CG system includes the summation of a non–bonded potential and a bond–angle energy potential, written as:

U_{total}=U_{nb}+U_{ba}.

(1)

The non–bonded energy function is described by tabular values with a cutoff distance $r_{c}=16$ Å, and applies to all pairs of atoms (pair potential), except contributions from the first–, second–, and third–bonded neighbors. The bond–angle energy term (BA potential) includes three–body energy contributions from all sequential bead triplets along each chain and is also described by tabular values, written as:

U_{ba}=\sum_{ijk}^{N_{\theta}}s_{i}\mathcal{V}(l_{ij},\theta_{ijk})+s_{k}\mathcal{V}(l_{jk},\theta_{ijk})

(2)

where $s_{i}=1$ if bead $i$ is an end bead and $s_{i}=\frac{1}{2}$ if bead $i$ is an internal bead within the chain. The energy contribution from internal beads is halved because the bonds in a linear chain are shared by adjacent angles. It should be emphasized that for the CG model to be able to reproduce correlations between the bond–length and bond–angle distributions, the contributions of the bond–length and bond–angle energies cannot be additively decomposed, necessitating a multivariate potential energy function. The bond–angle potential as defined in Eq. (2) is implemented as a user angle style within the LAMMPS molecular dynamics code and is available along with the tabulated potentials on GitHub.³¹

2.2 KDE–based probability distribution

Following the work by McCabe²⁹, we computed the KDE of the RDF and the BADF. The RDF is computed on a one–dimensional grid using the following:

g(r_{k})=\frac{\frac{1}{N}\sum_{i}^{N}\sum_{j\in\mathcal{N}_{i}}K_{r}(r_{k}-r_{ij})\Delta r}{\frac{4\pi}{3}\bar{\rho}\left((r_{k}+\frac{1}{2}\Delta r)^{3}-(r_{k}-\frac{1}{2}\Delta r)^{3}\right)},

(3)

where $N$ is the number of beads, $\Delta r$ is the grid spacing, $r_{ij}$ is the distance between beads $i$ and $j$ , $\mathcal{N}_{i}$ is the set of beads within the cutoff distance of bead $i$ , $r_{k}$ is the grid point that starts at $\frac{\Delta r}{2}$ , and $\bar{\rho}$ is the overall number density of beads in the system. A Gaussian kernel function $K_{r}$ distributes the contribution of each pair distance to the grid value using a bandwidth parameter $w_{r}$ :

K_{r}(r)=\frac{1}{w_{r}\sqrt{2\pi}}\,\exp\left(-\frac{r^{2}}{2w_{r}^{2}}\right)

(4)

The numerator of (3) is similar to a histogram of the number of beads located within spherical shells emanating out from a central bead, except that the counting of each bead is smeared across multiple grid points by the kernel function, defined in Eq. (4). The denominator produces the standard normalization of the RDF.

The joint bond–length and bond–angle probability density distribution function (BADF) is computed using a multidimensional KDE. We introduced two measures for the probability function, $P(l,\theta)$ is the true probability density function while $\widehat{P}(l,\theta)$ is the entropy–scaled probability.

	$\displaystyle P(l,\theta)=\frac{1}{N}\sum\limits_{I=1}^{N}K_{H}(\mathbf{x}-{{\mathbf{x}}_{I1}})+K_{H}(\mathbf{x}-{{\mathbf{x}}_{I2}})$		(5)
	$\displaystyle\widehat{P}(l,\theta)=\frac{1}{N}\sum\limits_{I=1}^{N}\frac{K_{H}(\mathbf{x}-{{\mathbf{x}}_{I1}})+K_{H}(\mathbf{x}-{{\mathbf{x}}_{I2}})}{\sin\theta_{I}}$		(6)

where $\mathbf{x}_{I1}=(l_{I1},\theta_{I})$ and $\mathbf{x}_{I2}=(l_{I2},\theta_{I})$ are split from triplet $(l_{I1},l_{I2},\theta_{I})$ sampled from the systems. This decomposition requires $l_{I1}$ and $l_{I2}$ to be uncorrelated. The Spearman correlation coefficient of the CG bond lengths is 0.027, indicating that the assumption is valid for the coarse–grained model considered here. The kernel function used for computing the BADF, denoted by $K_{H}$ , is a two–dimensional Gaussian function, defined as:

K_{H}(\mathbf{x})=\frac{1}{2\pi{\left|\mathbf{H}\right|}^{1/2}}\exp\left(-\frac{\mathbf{x}^{T}\cdot\mathbf{H}^{-1}\cdot\mathbf{x}}{2}\right),

(7)

where $\mathbf{H}$ is a matrix,

\mathbf{H}=\left[\begin{matrix}w_{l}^{2}&0\\ 0&w_{\theta}^{2}\end{matrix}\right],

(8)

in which, $w_{l}$ and $w_{\theta}$ are the bond–length and bond–angle bandwidth parameters, respectively.

Since the Jacobian of the transformation from Cartesian to polar coordinates is proportional to $\sin\theta$ , the probability distribution $P(l,\theta)$ vanishes at $\theta=\pi$ , which introduces numerical artifacts to the IBI steps that worsen with increasing $w_{\theta}$ . An entropy correction factor $\sin\theta_{I}$ is introduced to the kernel function to account for this entropic effect as shown in Eq. (6). The computed values of the entropy–scaled BADF spuriously decrease as $\theta$ approaches $\pi$ . To mitigate this issue, mirrored data points, i.e., $(l_{I},2\pi-\theta_{I})$ , were added for $(l_{I},\theta_{I})$ where $\theta_{I}>\pi-5w_{\theta}$ . The entropy–scaled BADF will be used in IBI updates.

Due to the differentiability of the KDE probability distribution functions, we can compute the derivatives of the potential energy directly from the derivatives of the probability distributions. The derivatives of the bond–angle distributions are:

$\displaystyle\partialderivative{\widehat{P}}{l}$	$\displaystyle=-\frac{1}{N}\sum\limits_{I=1}^{N}\frac{K_{H}(\mathbf{x}-{{\mathbf{x}}_{I1}})}{\sin\theta_{I}}\left(\frac{l-l_{I1}}{w_{l}^{2}}\right)+\frac{K_{H}(\mathbf{x}-{{\mathbf{x}}_{I2}})}{\sin\theta_{I}}\left(\frac{l-l_{I2}}{w_{l}^{2}}\right)$	(9)
$\displaystyle\partialderivative{\widehat{P}}{\theta}$	$\displaystyle=-\frac{1}{N}\sum\limits_{I=1}^{N}\frac{K_{H}(\mathbf{x}-\mathbf{x}_{I1})}{\sin\theta_{I}}\left(\frac{\theta-\theta_{I}}{w_{\theta}^{2}}\right)+\frac{K_{H}(\mathbf{x}-\mathbf{x}_{I2})}{\sin\theta_{I}}\left(\frac{\theta-\theta_{I}}{w_{\theta}^{2}}\right)$	(10)
$\displaystyle\begin{split}\partialderivative{\widehat{P}}{\theta}{l}&=-\frac{1}{N}\sum\limits_{I=1}^{N}\frac{K_{H}(\mathbf{x}-\mathbf{x}_{I1})}{\sin\theta_{I}}\left(\frac{l-l_{I1}}{w_{l}^{2}}\right)\left(\frac{\theta-\theta_{I}}{w_{\theta}^{2}}\right)\\ &\hskip 45.0pt+\frac{K_{H}(\mathbf{x}-\mathbf{x}_{I2})}{\sin\theta_{I}}\left(\frac{l-l_{I2}}{w_{l}^{2}}\right)\left(\frac{\theta-\theta_{I}}{w_{\theta}^{2}}\right)\end{split}$		(11)

Eqs. (9–11) are computationally expensive and computational cost scales with the total number of grid points multiplied by the number of sampled angle triplets. While KDEs can be accelerated using fast Fourier transforms, it can be accomplished more straightforwardly on graphics processors. The code that calculated the KDE–based BADF is available on Github.³²

2.3 IBI training

A set of 15 CG systems were generated by placing 25 chains of 80 beads, where each bead represents a C₂H₄ monomer, in a periodic simulation box via a random walk. The simulation box size was selected so that the density $\rho$ = 0.785 g/cm³, which is consistent with the reported density of amorphous PE.^{33, 34} Next, the systems were relaxed using the HB potential developed by Li et al.³⁰, in which a short 2–ps NVE run was performed using a displacement–limiting integrator and temperature rescaling to 600 K to eliminate nearly overlapping beads introduced by the random walk, followed by a 50–ns simulation performed in the isotropic isothermal–isobaric (NPT) ensemble at 600 K and 1 atm to equilibrate the system in the melt. We then ramped the system temperature down to 500 K over 25 ns, which was followed by another 50–ns equilibration at 500 K in the isotropic NPT ensemble.

We subsequently reverse–mapped the systems into atomistic resolution following the method used in the previous work.³⁵ The atomistic interactions for PE were modeled using the polymer consistent force field (PCFF).³⁶ Then, we equilibrated the systems in the anisotropic NPT ensemble for 1 ns at 500 K and ramped the system temperature down to 300 K over 2 ns. Finally, the systems were equilibrated at 300 K and 1 atm for 2 more nanoseconds in the anisotropic NPT ensemble. These 15 equilibrated atomistic systems were used as target systems. All MD simulations in this work were performed using the Large–scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). ³⁷

The average density of the equilibrated target systems is 0.758 $\pm$ 0.002 g/cm³ on a 95% confidence interval. To verify that the systems remained amorphous, we calculated their crystallinity using the local $p_{2}$ order parameter, the definition of which is adopted from the work by Yi et al.¹¹, written as $p_{2,i}=\left<\frac{3\cos^{2}\theta_{ij}-1}{2}\right>_{j}$ . The final crystallinity was 0.0019 $\pm$ 0.0023. After equilibration, the trajectories of the target atomistic systems were sampled each picosecond over a 1–ns NVT simulation performed at 300 K. The target distribution functions were then calculated using KDE as described previously. The grid size for the RDF was 0.01 Å, while the grid size for the entropy–scaled BADF was 0.008 Å and 0.013 rad, for the bond–length and bond–angle, respectively.

To reduce the equilibration time needed in IBI iterations, we pre–equilibrated the CG systems at 300 K with HB potential to minimize the structural difference between CG systems and target atomistic systems. After being equilibrated at 500 K, the CG systems were then quenched to 300 K over 1 ns under isotropic NPT ensemble to minimize crystallization, which was followed by another 1–ns 300 K equilibration in the isotropic NPT ensemble. These 15 pre–equilibrated CG systems are used in the following IBI iterations.

The potentials of mean force were used as initial guesses for the CG potential functions by computing Boltzmann inversions of the structural distribution functions. Using the KDE representation, we directly calculate the initial potentials and their derivatives:

$\displaystyle U_{0}(r)$	$\displaystyle=-k_{B}T\ln g_{*}$	(12)
$\displaystyle U_{0}(l,\theta)$	$\displaystyle=-k_{B}T\ln\widehat{P}_{*}$	(13)
$\displaystyle\partialderivative{U_{0}}{l}(l,\theta)$	$\displaystyle=-\frac{k_{B}T}{\widehat{P}_{}}\partialderivative{\widehat{P}_{}}{l}$	(14)
$\displaystyle\partialderivative{U_{0}}{\theta}(l,\theta)$	$\displaystyle=-\frac{k_{B}T}{\widehat{P}_{}}\partialderivative{\widehat{P}_{}}{\theta}$	(15)
$\displaystyle\partialderivative{U_{0}}{l}{\theta}(l,\theta)$	$\displaystyle=-\frac{k_{B}T}{\widehat{P}_{}}\left(\partialderivative{\widehat{P}_{}}{l}{\theta}-\frac{1}{\widehat{P}_{}}\partialderivative{\widehat{P}_{}}{l}\partialderivative{\widehat{P}_{*}}{\theta}\right)$	(16)

where $k_{B}$ is the Boltzmann constant, $g_{*}$ and $\widehat{P}_{*}$ represent the RDF and entropy–scaled BADF of the target systems, respectively. The accurate analytical energy derivatives allow bicubic interpolation to interpolate $\mathcal{V}(l,\theta)$ and its derivatives in the $l-\theta$ space. The bicubic interpolation provides $C^{1}$ continuity in the energy potential.

Two cases require correction of unphysical artifacts in the pair potentials. The first case arises as $g(r)\rightarrow 0$ at small values of $r$ due to the excluded volume around each bead, leading to an undefined pair energy, which we replace with a short–range interaction of the form, $ar^{-9}+br+c$ . The short–range interaction is applied for $r<r_{0}$ , the poorly sampled region where less than a thousand pairs were observed, equivalent to $g(r)<1e-4$ in our systems. The parameters, $a$ , $b$ , and $c$ are then computed to enforce $C^{2}$ continuity in $U(r)$ at $r_{0}$ . The second case arises when the bandwidth, $w_{r}$ , is large compared with the grid spacing $\Delta r$ , which results in a spurious peak in the pair distribution function at $r=0$ that is caused when the kernel function decays to zero more slowly than the $r_{k}^{2}$ in the kernel estimate of the RDF given in Eq. (3). To resolve the issue, we used a similar extrapolation method as the first case with $r_{0}$ chosen so that $U(r_{0})=\mathrm{max}(U)/2$ , as shown in Figure 2.

Low–probability regions, where less than fifty data points were observed or $\widehat{P}(l_{k},\theta_{k})<10^{-5}(\mathrm{rad}\cdot\mathrm{\AA})^{-1}$ , in the BADF also cause numerical difficulties when computing the potential energy and its updates. Zero or extremely small values in the BADF lead to either undefined values or erratic energy updates that can slow or preclude the convergence of the CG potential. Therefore, we extrapolate the bond–angle energy in the low–probability regions. Grid points $k$ , such that $\widehat{P}(l_{k},\theta_{k})<10^{-5}(\mathrm{rad}\cdot\mathrm{\AA})^{-1}$ , are placed in a set $\mathcal{T}$ and all remaining grid points are placed in a set $\mathcal{A}$ . Next, grid point $k$ with the largest number of known neighboring points in set $\mathcal{A}$ is selected. The neighbors of point $k$ are defined as all points $j$ such that $|l_{k}-l_{j}|<\delta_{l}$ and $|\theta_{k}-\theta_{j}|<\delta_{\theta}$ . The energy and derivatives at point $k$ , are then recomputed using biquadratic extrapolation from the known neighboring points and point $k$ is moved to set $\mathcal{A}$ . The patch size of the neighboring points was selected as $\delta_{l}=0.192$ Å and $\delta_{\theta}=0.176$ rad. This procedure is repeated until set $\mathcal{T}$ is empty.

After generating initial potentials, we sampled the RDF and entropy–scaled BADF over 20 ps after a 20–ps equilibration. We subsequently calculated $g(r)$ and $\widehat{P}(l,\theta)$ defined in Eq. (3–7), which are then used to update the initial guesses of the potential.

Similar to the initial guesses, the energy updates are calculated with the target and current iteration distribution functions and their derivatives:

$\displaystyle\Delta U(r)$	$\displaystyle=-\alpha{k_{B}}T\ln\frac{g_{*}}{g_{i}}$	(17)
$\displaystyle\Delta U(l,\theta)$	$\displaystyle=-\gamma{k_{B}}T\ln\frac{\widehat{P}_{*}}{\widehat{P}_{i}}$	(18)
$\displaystyle\partialderivative{\Delta U}{l}(l,\theta)$	$\displaystyle=-\gamma{k_{B}}T\left(\partialderivative{\widehat{P}_{}}{l}\frac{1}{\widehat{P}_{}}-\partialderivative{\widehat{P}_{i}}{l}\frac{1}{\widehat{P}_{i}}\right)$	(19)
$\displaystyle\partialderivative{\Delta U}{\theta}(l,\theta)$	$\displaystyle=-\gamma{k_{B}}T\left(\partialderivative{\widehat{P}_{}}{\theta}\frac{1}{\widehat{P}_{}}-\partialderivative{\widehat{P}_{i}}{\theta}\frac{1}{\widehat{P}_{i}}\right)$	(20)
$\displaystyle\begin{split}\partialderivative{\Delta U}{l}{\theta}(l,\theta)&=\gamma{k_{B}}T\Bigg{[}\frac{1}{{\widehat{P}_{}}^{2}}\left(\partialderivative{\widehat{P}_{}}{\theta}\partialderivative{\widehat{P}_{}}{l}-\partialderivative{\widehat{P}_{}}{l}{\theta}\widehat{P}_{*}\right)\\ &\hskip 40.0pt-\frac{1}{{\widehat{P}_{i}}^{2}}\left(\partialderivative{\widehat{P}_{i}}{\theta}\partialderivative{\widehat{P}_{i}}{l}-\partialderivative{\widehat{P}_{i}}{l}{\theta}\widehat{P}_{i}\right)\Bigg{]}\end{split}$		(21)

where the coefficients $\alpha$ and $\gamma$ are scaling factors for $U(r)$ and $U(l,\theta)$ , respectively, we took values of 0.15 and 0.1 for $\alpha$ and $\gamma$ to facilitate convergence. The potential values are evaluated at the same grid points as their corresponding distribution functions. Similar to the initial pair potentials, unphysical artifacts can appear in the updated pair potentials as well, which are corrected by the same approach.

Following the energy updates, we iteratively adjusted the updated pair potentials to ensure that the resulting pressure does not deviate far from atmospheric pressure with the method used by Wang et al.³⁸

\displaystyle\Delta V_{pc}(r)=A\left(1-\frac{r}{r_{c}}\right)

(22)

where $r_{c}$ is the cutoff distance of non–bonded interaction, and coefficient A can be calculated as:

\displaystyle-\left[\frac{2\pi N\rho}{3r_{c}}\int_{0}^{r_{c}}r^{3}g(r)dr\right]A\approx\Delta PV

(23)

After the pressure correction, we conduct the sampling process mentioned earlier for error evaluation. In each iteration, after the distribution functions sampled from the updated potentials, we evaluate the representation error and the sampling error in each iteration in the same way as reported by Liu and Oswald.³ The representation error of the RDF at iteration $i$ is defined as:

\displaystyle\epsilon_{r}^{(i)}=\frac{||g_{i}(r)-g_{*}(r)||_{2}}{||g_{*}(r)||_{2}}

(24)

where $||g_{i}(r)||$ denotes the $L_{2}$ norm of $g_{i}(r)$ . The sampling error is defined as:

\displaystyle\epsilon_{s}^{(i)}=\frac{\sqrt{(||w_{i}(r)||_{2})^{2}+(||w_{*}(r)||_{2})^{2}}}{||g_{*}(r)||_{2}}

(25)

where $w_{i}(r)$ and $w_{*}(r)$ are the confidence interval width of $g_{i}(r)$ , defined as:

\displaystyle w(r)=t^{*}\frac{\sigma(r)}{\sqrt{n}}

(26)

where $t^{*}$ is the critical value for the $t$ -distribution with a 95% confidence interval and $\sigma(r)$ is the sample standard deviation, and $n$ is the number of systems at current iteration. The error of $P(l,\theta)$ is evaluated in the same way. We continue to the energy update of the next iteration if $\epsilon_{r}>\epsilon_{s}$ , or increase the number of included CG systems if otherwise. The trainings are considered to be converged if the maximum number of CG systems is reached. For the trainings of the current work, we started with 5 CG systems and the maximum number was 15.

3 Results and discussion

To determine appropriate baseline KDE bandwidth parameters, we first compare the representation and sampling errors of the target structural distributions sampled from atomistic simulations. The representation error measures the loss of detail incurred by increasing the KDE bandwidth and is computed as the $L_{2}$ norm of the difference between the distributions calculated at a particular bandwidth and the reference bandwidth. The sampling error measures the average deviation of the distributions sampled across different systems. The definitions of these errors for the radial distribution functions are:

	$\displaystyle\epsilon_{r}=\frac{\left\\|g_{w_{r}}-g_{0}\right\\|_{2}}{\left\\|g_{0}\right\\|_{2}},$		(27)
	$\displaystyle\epsilon_{s}=\frac{\left\\|\sigma_{w_{r}}\right\\|_{2}}{\left\\|g_{w_{r}}\right\\|_{2}},$		(28)

where $g_{w_{r}}$ is the RDF calculated using bandwidth $w_{r}$ , $g_{0}$ is the RDF calculated at the reference bandwidth $w_{r_{0}}=0.01$ Å, and $\sigma_{w_{r}}$ is the point–wise standard deviation of $g_{w_{r}}$ calculated from the 15 different target atomistic systems. The representation and sampling errors of the bond–angle probability distributions are calculated as:

	$\displaystyle\epsilon_{r}=\left\\|P_{H}-P_{H_{0}}\right\\|_{2}$		(29)
	$\displaystyle\epsilon_{s}=\int{\sigma_{H}dld\theta},$		(30)

where $H$ represents the bandwidth matrix, defined by $w_{l}$ and $w_{\theta}$ , and $\sigma_{H}$ denotes the standard deviation of the probability distributions computed using a given bandwidth matrix. The bond–length and bond–angle bandwidths used for the reference bandwidth matrix are $w_{l_{0}}=0.002$ Å and $w_{\theta_{0}}=0.0031~{}\mathrm{rad}$ , respectively. As shown in Figure 3, increasing the kernel bandwidth parameters reduces the sampling errors but results in substantially larger increases in the representation errors. The baseline bandwidth parameters were chosen where the representation and sampling errors have equal magnitudes: $w_{r}^{*}=0.07$ Å $w_{l}^{*}=0.016$ Å, and $w_{\theta}^{*}=0.021~{}\mathrm{rad}$ .

Subsequently, we conducted IBI training varying different bandwidths individually in the target distributions and studied the effect of each bandwidth on the resulting potentials. For $w_{r}$ study, we had 4 trainings using different $w_{r}$ with constant $w_{l}$ and $w_{\theta}$ , e.g., $(nw^{*}_{r},w^{*}_{l},w^{*}_{\theta})$ where n = 1, 2, 4, and 8. Studies of $w_{l}$ and $w_{\theta}$ were conducted similarly. Therefore, there were 10 training conducted in total since $(w^{*}_{r},w^{*}_{l},w^{*}_{\theta})$ is shared in all three studies.

It is important to note that we used the baseline bandwidth set $(w^{*}_{r},w^{*}_{l},w^{*}_{\theta})$ in the CG simulations of all the trainings while changing the bandwidths of target distributions. If the bandwidths used in training are the same as the ones in the target distributions, the smoothing of the CG and target distributions will negate each other and the CG models would be trained to reproduce the same target structure, i.e., all the converged potentials will produce almost the same distribution functions if smoothed with the same bandwidths. Also, training with large bandwidths will have uniqueness issues; i.e., a set of strongly smoothed target distributions can be produced by more than one set of potentials due to the over–smoothed distribution functions of CG systems.

The trained CG models successfully reproduced the target distributions in all cases. A comparison between the target and CG distribution function for the $(w^{*}_{r},w^{*}_{l},w^{*}_{\theta})$ case is shown in Figure 4 as an example. In addition, the pair potentials of different cases are presented in Figure 5(a). The bond–angle potential of the case $(w^{*}_{r},w^{*}_{l},w^{*}_{\theta})$ is presented in Figure 5(b) as an example. It can observed that there are four major potential wells corresponding to the 4 peaks shown in Figure 4(c), representing different local conformations of PE chains. Expectation projection of bond–angle potentials in $l$ - and $\theta$ -direction, as defined in the caption, are presented in Figure 5(c) and (d), respectively. It is shown that as larger bandwidths are used, the BA potential would have less prominent peaks and crests, indicating that the energy barriers between the conformations are smaller. It is not recommended to use $w_{l}>4w_{l}^{*}$ or $w_{\theta}>4w_{\theta}^{*}$ if accurate representation of local conformation is needed, since some gauche conformations would not be produced due to disappeared corresponding energy wells.

Following the method of Li et al. ³⁰, the influence of the bandwidth parameters on the stable integration time step was determined by finding the largest time step for which the energy drift over a 1–ps NVE simulation was less than 1% of the initial kinetic energy of the system. Table 1 compares the performance of CG models of different bandwidth cases. It can be observed that the potentials trained in larger bandwidth cases can tolerate larger critical time steps and hence have better computational performance. It is verified that the highest–frequency mode is the bond vibration, since the critical time step significantly increases when the $l$ -direction stiffness is reduced by using target distributions with larger $w_{l}$ . From the perspective of time step–per–second, our CG–BA model has a performance of 29.78 time step/s with an 8000–monomer system and one MPI task, which is about 91% of the performance of the HB model. The slowdown is due to more computationally expensive bicubic interpolation compared to the simple linear interpolation used in the HB model and the cost induced by extra MPI gathering and communication of bond information.

$w_{r}$	$w_{l}$	$w_{\theta}$	$\Delta t_{c}$	CPU	Speed
(Å)	(Å)	(rad)	(fs)	(s)	up
0.07	0.016	0.021	4	1.05	$\times$ 47
0.14	0.016	0.021	6	0.70	$\times$ 70
0.28	0.016	0.021	4	1.05	$\times$ 47
0.56	0.016	0.021	6	0.70	$\times$ 71
0.07	0.032	0.021	4	1.05	$\times$ 47
0.07	0.064	0.021	7	0.60	$\times$ 83
0.07	0.128	0.021	15	0.28	$\times$ 177
0.07	0.016	0.042	4	1.05	$\times$ 47
0.07	0.016	0.084	6	0.70	$\times$ 71
0.07	0.016	0.168	6	0.70	$\times$ 71
HB			5	0.76	$\times$ 65
all-atom (PCFF)			1	49.5	–

Table 1: Performance comparison of the models calibrated using different bandwidth parameters, where

\Delta t_{c}

is the critical time step, CPU is the wall time required per monomer per nanosecond for an NPT simulation performed on a single core on an Intel (R) Xeon (R) E5-2680 v4, and the speed-up factor is the relative performance to an all–atom simulation.

To understand the effect of bandwidths on the crystalline phase chain diffusion behavior of semicrystalline polymers, we calculated the activation energy of the potentials trained with different targets. We constructed a crystal system in a periodic simulation cell with x, y, and z dimensions of 5.8, 5.3, and 20.5 nm, respectively. A total of 154 chains were placed parallel to the z-axis and are bonded across the periodic boundaries, representing an infinite lamellar thickness. The system were simulated in the anisotropic NPT ensemble for 50 ns at 1 atm, and various temperatures, namely 240, 270, 300, 330, and 370 K. Mean–squared displacements were measured to quantify diffusion. The diffusion coefficients were calculated with $D=\frac{g(t)}{2t}|_{t=50\mathrm{ns}}$ and the activation energy $E_{a}$ was calculated with the Arrhenius relationship. The results are shown in Figure 6.

Figure 6(a), (b), and (c) show the relationship between the activation energy and the $w_{r}$ , $w_{l}$ , and $w_{\theta}$ , respectively. While increasing $w_{r}$ also tends to reduce the activation energy of crystalline diffusion, $w_{l}$ and $w_{\theta}$ demonstrate a stronger correlation with $E_{a}$ , i.e., potentials from larger bandwidth trainings have a reduced activation energy of crystalline diffusion. We attribute the lower activation energy to the reduced energy barriers in $U(l,\theta)$ caused by stronger smoothing from larger bandwidths. It was shown that the defect movements are the agents of the chain–direction slip (or chain diffusion) in the crystal phases and account for the $\alpha_{c}$ -relaxation.³⁹ This indicates that the state transitions of torsion angles in local chain segments, which form conformational defects and enable defect movements, would also facilitate diffusion. The torsion angle transitions in the atomistic representation are implicitly represented by the state transitions of bond–angle triplets in our CG mapping as shown in Figure 1, meaning that energy barrier between the local conformation states in $U(l,\theta)$ strongly affects the activation energy of crystalline diffusion.

We then identified the energy barrier $E_{b}$ of the BA potentials. More specifically, for a given BA potential, we first found all the local minima (states) in $U(l,\theta)$ and identified the trans state $T$ . Then, we identified the minimum energy pathway from $T$ to all other states with the algorithm developed by Fu et al.⁴⁰ and calculated the maximum energy on the paths, which were regarded as the energy barriers of the paths. Eventually, the average energy barriers of all the paths are used as the energy barrier $E_{b}$ of the BA potential.

The relationship between the activation energy $E_{a}$ and the energy barrier $E_{b}$ is shown in Figure 6(d), where the data points follow a straight line. It verifies our hypothesis that the energy barrier in $U(l,\theta)$ correlates strongly with the activation energy $E_{a}$ ; and the reduction in the activation energy observed in crystalline diffusion $E_{a}$ is accounted by the decrease in $E_{b}$ due to stronger smoothing. Because the energy landscape is smoothened by the coarse–graining, the activation energy is lowered when compared with the experimental values of 5 to 22 kcal/mol³⁹, and 23 to 30 kcal/mol⁴¹, which will facilitate changes in local conformations. However, the controllability on $E_{b}$ provided by the bandwidth parameters offers a useful tool to adjust $E_{a}$ of CG models for different applications.

To understand how the bandwidth parameters can be used to accelerate crystallization kinetics to allow studies to be conducted at larger length and time scales, we simulated the crystallization process with 5 randomly generated systems using the trained CG potentials. For each system, 280 chains with 285 CG beads representing a molecular weight of 7980 g/mol were randomly placed in a cubic simulation box with side lengths of 10.6 nm. The chain length was selected so that polymer chains can crystallize into large crystallites and chains can pass through a crystallite multiple times and form loops. We then ran simulations in the anisotropic NPT ensemble at 300 K, 1 atm for 100 ns, during which the systems crystallized.

The crystallization of semicrystalline polymer consists of two processes. In primary crystallization, which is commonly modeled with the Avrami equation⁴², crystallites grow radially from nuclei and form spherulites. This process majorly consists of the growth of lamellae in the direction perpendicular to the stems (transverse direction). In secondary crystallization, lamellae thicken by reorganizing the polymer chains in adjacent amorphous phases in a manner similar to chain slips. We identified the characteristics of crystallites with the following procedures. After dividing the simulation boxes into bins of 7 Å, we calculated the $p_{2}$ parameter of the beads with a cutoff radius of 14 Å, and then identified the crystallites by finding the contiguous clusters of bins with average $p_{2}$ parameter greater than 0.4. To measure the secondary crystallization rate, we then calculated the average stem length in crystallites. To quantify the crystallite growth in the transverse direction, cross–sectional area of crystallites in the transverse plane were also estimated. The results are reported in Figure 7.

The crystallite growth follows a typical polymer crystallization process. The transverse growth starts and saturates quickly, providing the initial growth of crystallites, combined with stem growth that is responsible for the second–stage crystallite growth. Such a trend can also be observed from the rendered images in Figure 7(d). In image (i) and (ii), the system evolved from two nuclei to multiple small crystallites, meaning multiple nucleation sites formed which indicates sufficiently large domain size. The primary crystallization dominates the process from image (ii) to image (iii) which illustrates the system state only 2 ns later. Image (iv) demonstrates a fully crystallized state and after which no significant change was observed. The half–time of crystallization, defined as the time to reach 50% of relative crystallinity, is about 10 ns; when comparing to the experimental results by Zhuravlev et al.⁴⁴, we can roughly estimate that the acceleration is at most 100 times. More accurate estimation is impossible due to missing data for isothermal crytallization of polyethylene at 300 K. The CG model with accelerated crystallization kinetics is by no means ideal to reproduce realistic time scale, but does provide a viable and efficient approach to study the crystallization processes.

Multiple models are available to describe crystallite growth. The Avrami equation⁴² is commonly used to model crystallization kinetics, defined as:

\chi_{c}/\chi_{c,\infty}=1-\exp(-kt^{n}_{1})

(31)

where $\chi_{c}$ and $\chi_{c,\infty}$ denote the crystallinity and final crystallinity, $k$ is the crystallization rate, and $n$ is the Avrami exponent. The Avrami model assumes isotropic growth hence can not account for the secondary crystallization process. Among the works^{45, 46, 47, 48, 49, 50, 51, 52} that attempt to take the secondary crystallization into consideration, the Hay model^{53, 54} and Velisaris–Seferis model⁴³ were shown to generate the best fit according to Kelly and Jenkins.⁵⁵ We therefore selected the Velisaris–Seferis model because it converges to a finite value at infinite time. It is defined as:

\chi_{c}/\chi_{c,\infty}=w_{1}(1-e^{-k_{1}t^{n}_{1}})+w_{2}(1-e^{-k_{2}t^{n}_{2}})

(32)

where $k_{1}$ and $k_{2}$ signify the primary and secondary crystallization rates, respectively, $\chi_{c}$ denotes the crystallinity, $\chi_{c,\infty}$ is the final crystallinity, $w_{1}$ and $w_{2}$ are the weights of primary and secondary crystallization and sum to one, $n_{1}$ and $n_{2}$ are the Avrami exponents for primary and secondary crystallization. We selected $n_{1}=2$ and $n_{2}=1$ since the primary crystallization and the secondary crystallization are volume nucleation with two–dimensional and one–dimensional growth, respectively. We fitted the data in Figure 7(a) with the Velisaris–Seferis model to analyze the crystal growth. Naturally, the data shown in Figure 7(b) and (c) are fitted with the Avrami model. To distinguish the fitted parameters, the transverse growth rate is denoted by $k_{A}$ and the stem growth rate is denoted by $k_{l}$ , which has the same physical meaning as $k_{1}$ and $k_{2}$ , respectively. All the fittings matched with the data points well as shown in Figure 7 and the parameters are reported in Figure 8.

It can be observed that the trends in the row of $k_{l}$ and $k_{A}$ agree well with the row of $k_{2}$ and $k_{1}$ (e.g., the trend in Figure 8(a) is similar to the one in Figure 8(d)), respectively, which validates the fittings and post–processing since the parameters with the same physical implications are affected similarly. When considering the time to reach a certain relative crystsallinity level, the primary crystallization is generally 2 to 5 times faster than the secondary crystallization, consistent with the conclusion by Banks et al.⁵⁶ that secondary crystallization is less than one order of magnitude slower than the primary crystallization. The parameter $w_{r}$ reduces the secondary crystallization rate when $w_{r}$ is large, due to the deeper pair potential well in the ( $8w_{r}$ , $w_{l}$ , $w_{\theta}$ ) case posing higher energy barrier to chain slips. $w_{l}$ has little effect on the crystallization kinetics in the investigated range. Increasing $w_{\theta}$ greatly increases $k_{A}$ and $k_{1}$ ; the increase of $k_{A}$ and $k_{1}$ is attributed to a weaker constraint at $\theta=\pi$ resulting from larger $w_{\theta}$ , which facilitates the folding of the chains and accelerates the crystallite growth in the transverse direction.

4 Conclusions

In this work, we developed an extension to the IBI method that considers correlated multi–variable intramolecular interactions. This method also incorporates the KDE representation in distributions and energy formulations, which provides controllability to the smoothness of the resulting potential functions and the characteristics of the resulting models.

We trained 10 CG models of PE with different kernel bandwidths using this method. The models successfully reproduced the correlated distributions of bond–length and bond–angle of PE with an $L_{2}$ norm error of about 1%. The conformation states of bond–angle triplets, which represents the underlying atomistic conformations, were reproduced with appropriate bandwidths. We found that the activation energy of crystalline chain diffusion of the CG model is determined by the energy barrier $E_{\mathrm{b}}$ between the conformational states. By adjusting the bandwidths of the target distributions, we could tune the magnitude of the peaks and crests in the $l-\theta$ energy landscape, giving us controllability towards the system dynamics. Increasing the bandwidths will accelerate the dynamics by smoothing out the peaks and crests at the cost of weakened ability to accurately reproduce the local conformations; it will also reduce the stiffness of the BA potential and the maximum frequency of the model, leading to increased critical time step and computational performance. By adjusting the bandwidth, one could focus on accurately reproducing the local conformations, or efficiently cover long time scale in simulations

In the study of isothermal crystallization of supercooled PE, the models successfully crystallized the systems and reproduced the primary and secondary crystallization kinetics. The model trained with larger $w_{r}$ would slow down the secondary crystallization rates because deep pair potential well creates resistance for chain slip, whereas $w_{l}$ have little effect on the crystallization kinetics within the investigated range. Additionally, increasing $w_{\theta}$ greatly increases the primary crystallization rate, indicating the flexibility of chains has a significant impact on the crystallite growth in the transverse direction. The CG model presented in this work shows controllable characteristics in diffusion dynamics and crystallization kinetics. We look forward to further testing the capability of the model in simulating mechanical properties.

5 Acknowledgements

The authors acknowledge support from the National Science Foundation (NSF) under award Division of Civil, Mechanical, and Manufacturing Innovation (CMMI) 1653830. The authors acknowledge Research Computing at Arizona State University for providing HPC resources that have contributed to the research results reported within this paper.

References

Nielsen et al. 2004 Nielsen, S. O.; Lopez, C. F.; Srinivas, G.; Klein, M. L. Coarse grain models and the computer simulation of soft materials. J. Phys. Condens. Matter 2004, 16, R481
Peter and Kremer 2009 Peter, C.; Kremer, K. Multiscale simulation of soft matter systems–from the atomistic to the coarse-grained level and back. Soft Matter 2009, 5, 4357–4366
Liu and Oswald 2019 Liu, M.; Oswald, J. Coarse–grained molecular modeling of the microphase structure of polyurea elastomer. Polymer 2019, 176, 1–10
Liu et al. 2023 Liu, M.; Ye, J.; Oswald, J. Coarse-grained molecular simulation of the role of curing rates on the structure and strength of polyurea. Comput. Mater. Sci. 2023, 230, 112428
Salerno et al. 2016 Salerno, K. M.; Agrawal, A.; Perahia, D.; Grest, G. S. Resolving dynamic properties of polymers through coarse-grained computational studies. Phys. Rev. Lett. 2016, 116, 058302
Hoy and Karayiannis 2013 Hoy, R. S.; Karayiannis, N. C. Simple model for chain packing and crystallization of soft colloidal polymers. Phys. Rev. E 2013, 88, 012601
Nguyen et al. 2015 Nguyen, H. T.; Smith, T. B.; Hoy, R. S.; Karayiannis, N. C. Effect of chain stiffness on the competition between crystallization and glass-formation in model unentangled polymers. J. Chem. Phys. 2015, 143, 144901
Paul et al. 1995 Paul, W.; Yoon, D. Y.; Smith, G. D. An optimized united atom model for simulations of polymethylene melts. J. Chem. Phys. 1995, 103, 1702–1709
Waheed et al. 2002 Waheed, N.; Lavine, M.; Rutledge, G. Molecular simulation of crystal growth in n-eicosane. J. Chem. Phys. 2002, 116, 2301–2309
Waheed et al. 2005 Waheed, N.; Ko, M.; Rutledge, G. Molecular simulation of crystal growth in long alkanes. Polymer 2005, 46, 8689–8702
Yi et al. 2013 Yi, P.; Locker, C. R.; Rutledge, G. C. Molecular dynamics simulation of homogeneous crystal nucleation in polyethylene. Macromolecules 2013, 46, 4723–4733
Zubova et al. 2017 Zubova, E.; Strelnikov, I.; Balabaev, N.; Savin, A.; Mazo, M.; Manevich, L. Coarse-grained polyethylene: 1. The simplest model for the orthorhombic crystal. Polym. Sci. Ser. A 2017, 59, 149–158
Strelnikov et al. 2017 Strelnikov, I.; Zubova, E.; Mazo, M.; Manevich, L. Coarse-grained polyethylene: Including cross terms in bonded interactions and introducing anisotropy into the model for the orthorhombic crystal. Polym. Sci. Ser. A 2017, 59, 242–252
Lavine et al. 2003 Lavine, M. S.; Waheed, N.; Rutledge, G. C. Molecular dynamics simulation of orientation and crystallization of polyethylene during uniaxial extension. Polymer 2003, 44, 1771–1779
Yamamoto 2004 Yamamoto, T. Molecular dynamics modeling of polymer crystallization from the melt. Polymer 2004, 45, 1357–1364
Ko et al. 2004 Ko, M. J.; Waheed, N.; Lavine, M. S.; Rutledge, G. C. Characterization of polyethylene crystallization from an oriented melt by molecular dynamics simulation. Journal of Chemical Physics 2004, 121, 2823–2832
Yamamoto 2013 Yamamoto, T. Molecular dynamics of polymer crystallization revisited: Crystallization from the melt and the glass in longer polyethylene. Journal of Chemical Physics 2013, 139, 54903
Paajanen et al. 2019 Paajanen, A.; Vaari, J.; Verho, T. Crystallization of cross-linked polyethylene by molecular dynamics simulation. Polymer 2019, 171, 80–86
Sliozberg et al. 2018 Sliozberg, Y. R.; Yeh, I.-C.; Kröger, M.; Masser, K. A.; Lenhart, J. L.; Andzelm, J. W. Ordering and crystallization of entangled polyethylene melts under uniaxial tension: a molecular dynamics study. Macromolecules 2018, 51, 9635–9648
Larini et al. 2010 Larini, L.; Lu, L.; Voth, G. A. The multiscale coarse-graining method. VI. Implementation of three-body coarse-grained potentials. J. Chem. Phys. 2010, 132, 164107
Poursina et al. 2011 Poursina, M.; Bhalerao, K. D.; Flores, S. C.; Anderson, K. S.; Laederach, A. Strategies for articulated multibody-based adaptive coarse grain simulation of RNA. methods Enzymol. 2011, 487, 73–98
Munson and Singh 1997 Munson, P. J.; Singh, R. K. Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence-structure alignment. Protein Sci. 1997, 6, 1467–1481
Li and Liang 2005 Li, X.; Liang, J. Geometric cooperativity and anticooperativity of three-body interactions in native proteins. Proteins: Struct., Funct., Bioinf. 2005, 60, 46–65
Ejtehadi et al. 2004 Ejtehadi, M.; Avall, S.; Plotkin, S. Three-body interactions improve the prediction of rate and mechanism in protein folding models. Proc. Natl. Acad. Sci. 2004, 101, 15088–15093
Krishna et al. 2010 Krishna, V.; Ayton, G. S.; Voth, G. A. Role of protein interactions in defining HIV-1 viral capsid shape and stability: a coarse-grained analysis. Biophys. J. 2010, 98, 18–26
Schapotschnikow and Vlugt 2009 Schapotschnikow, P.; Vlugt, T. J. Understanding interactions between capped nanocrystals: Three-body and chain packing effects. J. Chem. Phys. 2009, 131, 124705
Fukunaga et al. 2002 Fukunaga, H.; Takimoto, J.-i.; Doi, M. A coarse-graining procedure for flexible polymer chains with bonded and nonbonded interactions. J. Chem. Phys. 2002, 116, 8183–8190
Reith et al. 2003 Reith, D.; Pütz, M.; Müller-Plathe, F. Deriving effective mesoscale potentials from atomistic simulations. J. Comput. Chem. 2003, 24, 1624–1636
McCabe et al. 2014 McCabe, P.; Korb, O.; Cole, J. Kernel Density Estimation Applied to Bond Length, Bond Angle, and Torsion Angle Distributions. J. Chem. Inf. Model. 2014, 54, 1284–1288
Li et al. 2019 Li, Y.; Agrawal, V.; Oswald, J. Systematic coarse-graining of semicrystalline polyethylene. J. Polym. Sci. Part B Polym. Phys. 2019, 57, 331–342
Ye and Oswald 2022 Ye, J.; Oswald, J. CGBA-potentials. 2022; \urlhttps://doi.org/10.5281/zenodo.6590569
Ye and Oswald 2022 Ye, J.; Oswald, J. ba-probability. 2022; \urlhttps://doi.org/10.5281/zenodo.6590571
Swan 1960 Swan, P. R. Polyethylene specific volume, crystallinity, and glass transition. J. Polym. Sci. 1960, 42, 525–534
Chiang and Flory 1961 Chiang, R.; Flory, P. Equilibrium between Crystalline and Amorphous Phases in Polyethylene1. J. Am. Chem. Soc. 1961, 83, 2857–2862
Agrawal et al. 2016 Agrawal, V.; Peralta, P.; Li, Y.; Oswald, J. A pressure-transferable coarse-grained potential for modeling the shock Hugoniot of polyethylene. J. Chem. Phys. 2016, 145, 104903
Maple et al. 1988 Maple, J. R.; Dinur, U.; Hagler, A. T. Derivation of force fields for molecular mechanics and dynamics from ab initio energy surfaces. Proc. Natl. Acad. Sci. U. S. A. 1988, 85, 5350–5354
Plimpton 1995 Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 1995, 117, 1–19
Wang et al. 2009 Wang, H.; Junghans, C.; Kremer, K. Comparative atomistic and coarse-grained study of water: What do we lose by coarse-graining? Eur. Phys. J. E 2009, 28, 221–229
Mowry and Rutledge 2002 Mowry, S. W.; Rutledge, G. C. Atomistic simulation of the $\alpha_{c}$ -Relaxation in crystalline polyethylene. Macromolecules 2002, 35, 4539–4549
Fu et al. 2020 Fu, H.; Chen, H.; Wang, X.; Chai, H.; Shao, X.; Cai, W.; Chipot, C. Finding an optimal pathway on a multidimensional free-energy landscape. J. Chem. Inf. Model. 2020, 60, 5366–5374
Ashcraft and Boyd 1976 Ashcraft, C. R.; Boyd, R. H. A dielectric study of molecular relaxation in oxidized and chlorinated polyethylenes. J. Polym. Science, Polym. Phys. Ed. 1976, 14, 2153–2193
Avrami 1939 Avrami, M. Kinetics of phase change. I General theory. J. Chem. Phys. 1939, 7, 1103–1112
Velisaris and Seferis 1986 Velisaris, C. N.; Seferis, J. C. Crystallization kinetics of polyetheretherketone (PEEK) matrices. Polym. Eng. Sci. 1986, 26, 1574–1581
Zhuravlev et al. 2016 Zhuravlev, E.; Madhavi, V.; Lustiger, A.; Androsch, R.; Schick, C. Crystallization of polyethylene at large undercooling. ACS Macro Lett. 2016, 5, 365–370
Hillier 1965 Hillier, I. Modified avrami equation for the bulk crystallization kinetics of spherulitic polymers. J. Polym. Sci. Part A Polym. Phys. 1965, 3, 3067–3078
Ravindranath and Jog 1993 Ravindranath, K.; Jog, J. Polymer crystallization kinetics: Poly (ethylene terephthalate) and poly (phenylene sulfide). J. Appl. Polym. Sci. 1993, 49, 1395–1403
Tobin 1974 Tobin, M. C. Theory of phase transition kinetics with growth site impingement. I. Homogeneous nucleation. J. Polym. Science, Polym. Phys. Ed. 1974, 12, 399–406
Tobin 1976 Tobin, M. C. The theory of phase transition kinetics with growth site impingement. II. Heterogeneous nucleation. J. Polym. Science, Polym. Phys. Ed. 1976, 14, 2253–2257
Tobin 1977 Tobin, M. C. Theory of phase transition kinetics with growth site impingement. III. Mixed heterogeneous–homogeneous nucleation and nonintegral exponents of the time. J. Polym. Science, Polym. Phys. Ed. 1977, 15, 2269–2270
Malkin et al. 1984 Malkin, A. Y.; Beghishev, V.; Keapin, I. A.; Bolgov, S. General treatment of polymer crystallization kinetics—part 1. A new macrokinetic equation and its experimental verification. Polymer Engineering & Science 1984, 24, 1396–1401
Urbanovici and Segal 1990 Urbanovici, E.; Segal, E. New formal relationships to describe the kinetics of crystallization. Thermochimica acta 1990, 171, 87–94
Urbanovici et al. 1996 Urbanovici, E.; Schneider, H.; Brizzolara, D.; Cantow, H. Isothermal melt crystallization kinetics of poly (L-lactic acid). J. Therm. Anal. Calorim. 1996, 47, 931–939
Chen et al. 2016 Chen, Z.; Hay, J. N.; Jenkins, M. J. The effect of secondary crystallization on crystallization kinetics–Polyethylene terephthalate revisited. European Polymer Journal 2016, 81, 216–223
Phillipson et al. 2016 Phillipson, K.; Jenkins, M. J.; Hay, J. N. The effect of a secondary process on crystallization kinetics–Poly ( $\epsilon$ -caprolactone) revisited. European Polymer Journal 2016, 84, 708–714
Kelly and Jenkins 2022 Kelly, C. A.; Jenkins, M. J. Modeling the crystallization kinetics of polymers displaying high levels of secondary crystallization. Polym. J. 2022, 54, 249–257
Banks et al. 1963 Banks, W.; Gordon, M.; Roe, R.; Sharples, A. The crystallization of polyethylene I. Polymer 1963, 4, 61–74