An amended approximation of the non-Gaussian probability distribution function

Network models of rubber elasticity are based on the conformational entropy of an idealized chain and mostly motivated by the non-Gaussian statistical theory by Kuhn and Grün. However, the non-Gaussian probability distribution function cannot be expressed in a closed form and requires an approximation. All such approximations applied in the literature demonstrate pronounced inaccuracies in comparison to the analytical solution. The ideal choice of the approximation function depends on a variety of factors, such as the chain parameters or the desired application of the approximation (the probability distribution function itself, the corresponding entropic energy, or the force developed by the chain). In addition, when making a choice regarding the best approximation for a given application, the applied error measure plays a significant role since the approximation that grants, for example, the best maximal relative error is not necessarily the same that provides the best mean absolute error. In the literature, this application-specific evaluation of available approximations is commonly disregarded. In this paper, we evaluate previously proposed approximations on the application-specific basis and develop an approach to derive a family of approximations for the free energy of a polymer chain in a broader range of the number of its chain segments. The analytical method based on the Padé technique delivers an approximation of the non-Gaussian probability distribution function that can be easily tailored depending on the desired application. The proposed approach is capable to provide much stronger predictions in comparison to the Kuhn and Grün model in a wide range of chain segment numbers.


Introduction
A statistical theory of a single long-chain polymer molecule is an essential basis for developing a micromechanical model for rubbers. Traditional molecular simulations take into account all chemical aspects at the molecular level, and are therefore very powerful and comprehensive [1]. However, this detailed information is essential only in microscopic dimensions or small-time ranges of up to 1 ms. Due to the computational expenses and limitations to size and calculation time of such simulations, statistical averaging to larger scales is of interest. The majority of the rubber elasticity network models are based on the statistical theory of an idealized chain and in particular on the freely jointed chain (FJC) model. It is the most simplified representation of a real polymer molecule that can randomly fluctuate in space.
In polymer physics, the probability distribution function (PDF) of an FJC with fixed end positions can be described by the random walk theory. The random walk theory describes a random movement of a particle where a direction in each step is independent of all previous steps. If the length of each step is constant, the particle trajectory is one of all possible conformations of the FJC model. The conformational entropy s of the chain is directly related to the PDF and, thus, to the end-to-end distance R by the Boltzmann equation where k B denotes the Boltzmann constant. The free energy W of an ideal polymer chain is derived from its conformational entropy, as where T denotes the absolute temperature. Thus, from a point of view of constitutive modeling, finding an approximation for P(R) that leads to a minimized error in ln½P(R) is of larger interest than minimizing the approximation error of P(R) itself. The first application of the random walk theory to polymer chain statistics was independently proposed by Kuhn [2] and Guth [3]. Kuhn and Gru¨n [4] introduced the inverse Langevin approximation for single-chain statistics for large deformations which appears accurate for relatively long polymer chains. Many constitutive models of rubber elasticity incorporated the Kuhn-Gru¨n approximations of the non-Gaussian theory [5][6][7][8]. These are, for example, the three-chain [9], the four-chain [10], the eightchain [5], the micro-macro unit sphere model [7], and full-network models [8,11,12]. However, deviations of the Kuhn-Gru¨n model from the exact distribution function in the cases of large extensibility and short chains have motivated to search for more accurate alternatives [9,13,14]. Moreover, the inverse Langevin function cannot be represented in an explicit form and requires an approximation, which increases the deviation from the exact distribution function.
As a result, all models proposed so far suffer from the inaccuracies in the Kuhn-Gru¨n approximation. While some approximations outperform the others in the majority of cases, there is no explicitly superior approximation. The ideal choice of the approximation function depends on a variety of factors, such as the chain parameters, or the desired application of the approximation (the probability distribution function itself, the corresponding free energy, or the entropic force developed by the chain). In addition, when making a choice regarding the best approximation for a given application, the applied error measure plays a significant role since the approximation that grants, for example, the best maximal relative error is not necessarily the same that provides the best mean absolute error.
It is hence reasonable to choose an approximation that suits the error measure deemed most critical for the desired application. In this regard, it appears questionable to take one specific approximation as the final solution that fits all conditions. All previously proposed approximations perform well only for a particular number of chain segments while the effect of n remains undiscussed. In this paper, we develop an approach to derive a family of approximations for the free energy of a polymer chain with a broader range of chain segments.

Statistics of a single chain
Chain-like molecules such as a single polymer chain in a dilute polymer solution consist of relatively small repeating units (monomers). They are covalently bonded together at some position referred to as cross-linking junctions. In an equilibrium state, the polymer chain can take a definite number of configurations between two end points.
By ignoring the variation of the length of a valence bond from its mean value due to atomic vibrations in the chain molecules [15], one can assume that each bond has a constant length l. The distribution function P(R) of the end-to-end distance vector R subject to the following condition (see Figure 1) can be expressed as [16] If the mean end-to-end distance jjRjj = ffiffiffiffiffiffiffiffiffiffi ffi R Á R p is much less than the maximum extension of the chain (jjRjj ( nl), equation (4) reduces to the Gaussian distribution function Most of the problems in polymer physics are treated based on the Gaussian chain model (5), which is physically unreasonable for real chains [15] in particular in the case of large deformations.
The probability distribution function is subject to the normalization condition generally written by [17] ð ' Equation (4) is given in the integral form and is thus not convenient for practical applications and numerical computation, particularly for large values of n. Rayleigh [18] obtained a solution for small values of n, e.g., n = 6. Chandrasekhar [19] evaluated the integral equation (4) for any finite value of n separately. Nevertheless, the calculations become very tedious for larger values of n. There have been several attempts to get an exact solution for equation (4). Treloar [20], Wang and Guth [9], Nagai [21], and Hsiung et al. [22] represented it in form of a series as which is also inconvenient for practical calculation, especially for large values of n. For this reason, it is usually approximated by a closed-form expression. In the following section, the most well-known approximations of the exact probability distribution function are discussed and compared with respect to their accuracy.

Appraisal and comparison of different approximations
In order to avoid tedious and computationally expensive calculation of the integral form (4) or its series expansion (7), several approximations based on different methods and approaches have been proposed. It is clear that a robust and comprehensive approximation must be capable of fitting to the exact solution for all ranges of the variables jjRjj, l, and n or at least to their physical ranges for a specific application. In order to verify the effects of these parameters, we introduce an end-to-end extension ratio t [ R=nl such that 0 ł t ł 1, and where the abbreviation R = jjRjj is used. In the following, different methods approximating the exact solution are discussed and compared.

Asymptotic analysis
Based on the asymptotic analysis [23], Kuhn and Gru¨n [4], James and Guth [24], and Flory and Rehner [10] obtained an approximation for equation (4) for the case n ) 1 and t ( 1. Their results are the same but follow distinctive approaches resulting from the non-Gaussian distribution with limited chain extension. The Kuhn-Gru¨n approximation is given by where L À1 (t) represents the inverse of the Langevin function defined as L(x) = coth (x) À 1=x. An accurate approximation of this function can be based on a high-order series expansion (see, e.g., Itskov et al. [25]), on a non-rational or a rational function. Recently, a number of approximations have been proposed for it (26)(27)(28). In the limit of n ! ', equation (8) results in the Gaussian distribution (5). An amended Kuhn-Gru¨n approximation proposed by Jernigan and Flory [14] aims to describe polymeric network chains in the context of rubber elasticity at high extensions t'1 as where C aKG denotes the normalization factor satisfying equation (6).

Steepest descent method
The steepest descent or saddle-point method is an extension of Laplace's method for an integral approximation. By the steepest decent, Wang and Guth [9] obtained the solution of equation (4) for n ) 1 as with q(t) = (k 4 À 3nk 2 2 )=8k 2 2 , where k 2 and k 4 are given by In a similar way, Yamakawa [16] also derived the following expression where csch represents the cosecans hyperbolicus. It should be noted that equation (12) is valid over the whole range of 0 ł t ł 1.

Series expansions
Most of the proposed approximations focus mainly on the simplified and specific solutions of the exact problem like series for the inverse of Langevin function. Most of the Taylor series expansions are developed around t = 0 and are only valid for limited extensions of the polymer chain. In case of n ) 1 (very long chains), the Taylor series for ( sin x=x) n is written as where B k denote the Bernoulli numbers. Using equation (13), equation (4) becomes + 1015 À 9660 R 2 n + 13734 R 4 n 2 À 5364 R 6 n 3 + 567 R 8 n 4 5600n 2 À 9 315 À 1260 R 2 n + 1134 R 4 n 2 À 324 R 6 n 3 + 27 R 8 n 4 + :::

Other approximations
Other approximations proposed in the literature are either the modified and enhanced version of the aforementioned approximations or formulated based on a combination of the previous approaches. For a model corresponding to a random walk on a diamond lattice, James and Guth [24] obtained an expression (unpublished but mentioned by Wang and Guth [9]) for the case of perfectly flexible chains as where A = ffiffi ffi 3 p t and f must be extracted from Malacarne et al. [29] proposed an approximation based on the non-Gaussian Tsallis distribution which is an improved version of the Gaussian model. Their proposal is efficient and simple to implement for a large number of monomers. Accordingly, the normalized distribution takes the form as where G represents the gamma function. Ilg et al. [13] proposed an estimation based on a one-dimensional version of a finitely extensible dumbbell model. The proposal utilizes trigonometric functions to approximate the inverse Langevin function in the formula by Kuhn and Gru¨n [4] as where C Ilg is the normalization factor. Recently, Khieˆm and Itskov [30] revised the approximation by Ilg, equation (18), as where C KI denotes the normalization factor. Very recently, Morovati and Dargazany [31] proposed multiplicative corrections of the Kuhn-Gru¨n approximation (8) as Note that all proposed approximation functions must satisfy the normalization condition (6).

Comparison and discussion of existing approximations
The above discussed approximations are compared in Figure 2 for the case of n = 10. Both the normalized PDFs and the corresponding normalized entropic energies are shown along with their respective relative and absolute errors. Generally, relative errors tend to provide a better assessment of the approximative performance. However, this is not the case for the entire range of R=nl as the exact PDF and all its approximations converge to zero with R=nl approaching one. This leads to extremely small values and large relative errors. In cases like these, the relative error is not meaningful. Hence, it is reasonable to judge the approximative capabilities by utilizing the absolute and relative errors side-by-side. Accordingly, the steepest descent, equation (12), and the three-term Morovati and Dargazany approximation, equation (21), significantly outperform the others for the example case of n = 10. However, this conclusion cannot be generalized for arbitrary n without further investigation. All previously proposed approximations have been proven to perform well in single cases chosen by the respective authors leaving other important domains of n undiscussed. Thus, a deeper investigation of the approximations over a reasonable range for n is necessary to ensure proper comparability.
In the following, all approximations discussed in the previous sections are evaluated for every integer n within n 2 ½3, 120. For each case, the absolute and relative errors were computed. The mean and the maximum of these errors are plotted in Figure 3. Extreme extensibility R=nl ! 1 is less important for practical applications as the chain force tends to infinity. Therefore, in this study, the errors are only evaluated in a range limited to R=nl\0:95. This limitation allows the comparison of different models in a more practically relevant and realistic way.
According to the results plotted in Figure 3, the approximative qualities of the approaches vary significantly for different n. While there are some approximations that outperform the others in the majority of cases, there is no clear superior approximation in the whole domain and with respect to every considered error measure. In order to find the most suitable approximation, one hence has to base the choice on the specific value of n of the polymer chain studied, whether the PDF itself or the corresponding entropic energy should be approximated, and the error measure deemed most important for the considered application. Since the exact solution (7) explicitly depends on the choice of n, a suitable approximation can be expected to do as well. However, all previously discussed approximations lack this feature.

An improved approximation based on a Padé approximant
Many previously discussed approximations share a common structure containing of up to three parts. The core of these approximations is the Kuhn-Gru¨n approximation P KG (t) corrected by a factor K cor (t, n). Finally, the normalization condition (6) is enforced using a scaling factor C, which results in a generalized approximation formula as P t, n ð Þ= CP KG t ð ÞK cor t, n ð Þ: ð22Þ In the present work, a correction function K cor (t, n) based on a Pade´approximation is proposed. For a given function f (t), the Pade´ [32][33][34]

approximant of order [L/M] is a rational function of the form
where L and M denote the integers determining the order of the polynomials in the nominator and denominator, respectively. The choice of q 0 = 1 does not limit the generality but provides uniqueness of the Pade´approximant for given L and M.
In the present work, we set L = 3 and M = 2, and consider the maximal extensibility of a polymer chain at t = 1 (where P ! 0). Accordingly, a PDF based on the Pade´approximation can be given by P t, n ð Þ= CP KG t ð ÞK cor t, n ð Þ, K cor t, n ð Þ= a n ð Þt 3 + b n ð Þt 2 + c n ð Þt + 1 with three parameters a, b, and c each depending on n. C can be calculated in a straight-forward fashion using the normalization condition (6). For convenience, C can also be factored into K cor (t, n), which leads to q 0 = C and linearly scales a, b, and c. Practically, this can be achieved by either optimizing for a, b, and c and then calculating equation (6) or directly applying the normalization as boundary condition while optimizing for a, b, c, and C. Both procedures will provide the same solution.
Usually, the parameters a, b, and c are obtained from partial sums of the Taylor series expansion around some point [35]. This is not feasible in the present context, where the function f (t) to approximate is not given in an explicit form and hence cannot be utilized. For this reason, the Levenberg-Marquardt method [36,37] implemented in MATLAB [38] was applied to obtain the parameters a, b, and c for every particular value of n. The specific target function for the optimization procedure has a special significance discussed in the following. While a least-squares-based approach targeting the PDF itself may be an obvious choice, it is neither the only nor necessarily the best one. Depending on the desired application, minimizing the relative error or the mean/maximum of one of the error norms can be advantageous. For example, it appears reasonable to focus on the optimization of ln (P) rather than on P itself due to the higher significance of the normalized energy compared to the pure PDF in the context of constitutive modeling of polymers.
For the sake of this exemplary study, we minimized the least-squares of the absolute error of ln (P). The so resulting values for a, b, and c are shown in Figure 4 as functions of n. This allows for perfect tailoring of the approximation to a specific application. In the relevant range of n, the parameters a, b, and c show a behavior, which can be easily fitted to simple functions, allowing quick and easy calculation (without any new optimization). The corresponding functions, as depicted in Figure 4, can be represented as a n ð Þ = À 0:42412 + 0:44

Results
In the following, the Pade´approximation-based approach featuring the functional dependence on n, as proposed in the previous section, is compared to the steepest descent approach and both the two-term and the three-term Morovati and Dargazany approximation which revealed the best performance in comparison to other approximations in a wide range of n. In Figure 5, the maximal absolute error of ln (P) and the maximal relative error of P are plotted versus n. One observes that the proposed approximation demonstrates the best performance in the whole range of n. In addition, the proposed approach leads to an error less dependent on n. While the advantage is especially prominent for the absolute error of ln (P), which was set as the optimization target, it is still present in the PDF itself. Except for very short chains (n \ 7), the relative error of the PDF is never higher than 1.3% and the absolute error of ln (P) never exceeds 0:01.
In a wide range of n, the proposed Pade´approximation-based approach is capable to provide stronger predictions in comparison to the others even when used with constant a, b, and c. Figure 6 shows the resulting error, when utilized with averaged constant values for a, b, and c. While the dependency on n enhances the approximation for n\20 and 100\n, the effect is minimal for chains of medium size.  Figure 4. Parameters a, b, and c obtained for up to n = 120. The optimization target was to minimize the least-squares of the absolute error of ln (P). In addition, an approximation for each curve is given.

Conclusion
In the current work, we showed that most of the previously proposed approximations of the non-Gaussian probability distribution function (4) of the polymer chain length perform well only for a particular number of chain segments n. The effect of n on the approximation quality was investigated. Although some approximations outperform others, in the majority of cases, there is no clear superior approximation in the whole domain and for every considered error measure. The proper choice depends on the particular chain under investigation (determined by n) and the desired application. As a result, there exists no versatile unique approximation to meet all the requirements. In this paper, we derived a family of approximations of the non-Gaussian probability distribution function on the basis of the Pade´method. The developed method is able to incorporate the polymer chain length and can be tailored to any specific application. For example, formulation (24)-(27) provides (a) (b) Figure 5. Comparison of the Padé-based approximation as given in equation (24) to the best approximations from the literature. The left picture shows the absolute error of ln (P) (which was set as optimization target), while the right shows the resulting relative error of P.
the minimal absolute error of ln(P). In a wide range of n, the proposed approach appears to be the most accurate in comparison to other approximations known from the literature. A simplified approach, which neglects the dependency on the number of chain segments n, was shown to still outperform most of the alternative approximations. The proposed approximation function is very robust for the modeling of polymer blends made up of short chains and also can minimize errors in simulating multimodal polymer networks which contain a wide range of polymer chain sizes [39]. Moreover, the non-Gaussian theory of polymer networks has been widely utilized to model the swelling behavior of cross-linked gels with finite extensibility of networks of high charge densities. The accuracy of such models can be considerably enhanced by the use of the proposed approach [40,41].