A machine-learning-enabled approach for bridging multiscale simulations of CNTs/PDMS composites

Lingjie Yu; Chao Zhi; Zhiyuan Sun; Hao Guo; Jianglong Chen; Hanrui Dong; Mengqiu Zhu; Xiaonan Wang

doi:10.1360/nso/20230055

Volume 3 / No 2 (2024)

Natl Sci Open, 3 2 (2024) 20230055

Special Topic: AI for Chemistry

Open Access

Issue		Natl Sci Open Volume 3, Number 2, 2024 Special Topic: AI for Chemistry


Article Number		20230055
Number of page(s)		15
Section		Chemistry
DOI		https://doi.org/10.1360/nso/20230055
Published online		01 February 2024

National Science Open 3: 20230055, 2024

RESEARCH ARTICLE

Lingjie Yu¹,²,³, Chao Zhi², Zhiyuan Sun², Hao Guo², Jianglong Chen², Hanrui Dong², Mengqiu Zhu⁴ and Xiaonan Wang¹,*

¹ Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
² School of Textile Science and Engineering, Xi’an Polytechnic University, Xi’an 710048, China
³ State Key Laboratory of Intelligent Textile Material and Products, Xi’an Polytechnic University, Xi’an 710048, China
⁴ Materials Genome Institute of Shanghai University, Shanghai University, Shanghai 201900, China

* Corresponding author (email: wangxiaonan@tsinghua.edu.cn)

Received: 6 September 2023
Revised: 7 January 2024
Accepted: 8 January 2024

Abstract

Benefitting from the interlaced networking structure of carbon nanotubes (CNTs), the composites of CNTs/polydimethylsiloxane (PDMS) have found extensive applications in wearable electronics. While hierarchical multiscale simulation frameworks exist to optimize the structure parameters, their wide applications were hindered by the high computational cost. In this study, a machine learning model based on the artificial neural networks (ANN) embedded graph attention network, termed as AGAT, was proposed. The datasets collected from the micro-scale and the macro-scale simulations are utilized to train the model. The ANN layer within the model framework is trained to pass the information from micro-scale to macro-scale, while the whole model is aimed to predict the electro-mechanical behavior of the CNTs/PDMS composites. By comparing the AGAT model with the original multiscale simulation results, the data-driven strategy is shown to be promising with high accuracy, demonstrating the potential of the machine-learning-enabled approach for the structure optimization of CNT-based composites.

Key words: multiscale simulation / machine learning / material property prediction / CNTs/PDMS composites

© The Author(s) 2024. Published by Science Press and EDP Sciences.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

In contrast to rigid devices, flexible electronic devices exhibit excellent adaptability on unconventional interfaces, particularly the surface of human skin, and thus can be widely used in various scenarios, such as human motion recognition, health monitoring, and human-machine interfaces [1,2]. Among them, the remarkable electrical and mechanical properties of carbon nanotubes (CNTs), together with the outstanding stretchability and flexibility of polydimethylsiloxane (PDMS), make the CNTs/PDMS composite a suitable strain sensor for numerous sensing applications [3–5].

The conductivity of the CNTs/PDMS composite benefits from the interlaced networking structure of CNTs, while the response of the CNTs/PDMS sensor primarily arises from changes in its electrically conductive network under material deformation caused by external forces [6]. Hence, the microstructure, including the geometrical morphology of CNTs, the volume ratio, and the distribution of CNTs within the matrix, significantly influences the properties of the CNTs/PDMS composite. Therefore, the structure-property relations in CNTs/PDMS composites have been widely investigated [7]. Among the available approaches, numerical simulation methods have gained popularity as they offer an efficient alternative to time-consuming experimental research [8]. Arora and Pathak [9] proposed an efficient computational methodology to predict effective orthotropic elastic properties of CNT-polymer composites at diverse constituent conditions. In that work, the Mori-Tanaka homogenization scheme was implemented with a finite element method (FEM) approach to predict the material properties of nanocomposites. Zhang et al. [10,11] studied the mechanical properties of 3D braided composites using FEM analysis. Li et al. [12] designed simulation models by changing the content, aspect ratio and orientation degree of CNTs to investigate the electrical conductivity of CNTs/PE composites with different meso-structures.

It is worth mentioning that the property of a single CNT which offers the essential parameters in the FEM model can be obtained by using the micro-scale calculation. Numerous studies have explored how the mechanical and electrical properties of CNTs are influenced by their geometric morphology. Ebbesen et al. [13] believed that the electronic properties of CNTs were strongly modulated by their small structural variations, and measured the electrical properties of different nanotubes with diverse lengths and radii. Bao et al. [14] simulated the Young’s moduli of CNTs based on molecular dynamics (MD) simulation, while Wagner et al. [15] investigated the piezoresistive effect of CNTs within density functional theory (DFT). Wei et al. [16] presented a method for measuring the natural frequencies (f) as a function of the length (L) of individual CNTs, and the axial Young’s moduli and radial shear moduli of the CNTs were obtained simultaneously through fitting the experimental f-L data using the Timoshenko beam model. Peralta-Inga et al. [17] proposed a density functional tight-binding self-consistent charge approach to study the elastic properties of nine CNTs of different helicities and diameters.

The studies reviewed above clearly indicate that improving the performance of CNTs/PDMS composites requires insights spanning the micro-scale to the macro-scale. In recent decades, the concept of multiscale modelling has emerged to describe procedures that seek to simulate continuum-scale behavior using information gleaned from computational models of finer scales in the system, rather than resorting to empirical constitutive models [18,19]. Multiscale approaches are particularly attractive for CNT-based composite structures due to the atomic-scale dependencies of a single CNT [4]. Multiscale methods have been mainly categorized into two classes: hierarchical and concurrent multiscale methods [20]. In hierarchical approaches, the molecular and macro models are simulated sequentially. To be specific, in the CNTs/polymer system, the molecular model can be utilized to calculate effective material properties, which are then passed to the FEM model to simulate the material properties on a macro-scale. Subramanian et al. [21] proposed an atomistically informed stochastic multiscale model to predict the behavior of CNT-enhanced nanocomposites. In that work, MD simulations were performed to study sub-nanoscale interactions of the CNT with the polymeric phase of the nanocomposite. Jiang et al. [22] developed a predictive constitutive model using a hierarchical multiscale approach based on molecular dynamics and the generalized interpolation material point method. The MD simulations were used to construct an elastodamage model, which was subsequently incorporated into material point methods for large-scale simulations.

However, the employment of molecular models within the FEM presents a significant computational burden, and designing the microstructure required to achieve the desired macroscopic set of properties is often intractable, due to the multiple optimization objectives, the high-dimensional and multiscale optimization space, the presence of nonlinear, stochastic, and multi-physics interactions, and the lack of governing equations for the macroscale behavior [23]. Fortunately, machine learning (ML) is an emerging field that provides high-dimensional, data-driven modeling that can map nonlinear correlations and therefore can be tailored to material discovery and optimization. ML uses a range of statistical and probabilistic approaches, allowing machine intelligence to learn from experience and to identify the hidden patterns (input-output correlations) from large and often noisy datasets, which are now seen as successful approaches for the design and discovery of new materials for a wide variety of applications [24]. Recently, ML has emerged as a powerful technique in the field of composite material [25–27]. Yuan et al. [28] proposed an axial elastic modulus degradation prediction method of [0_m/90_n]_s cross-ply laminates using an ML model. The dataset of the ML model was established based on the published experimental data and a small amount of FEM results. Liu et al. [29] proposed a hybrid ML method to predict the macroscopic thermal conductivity of CNTs-reinforced polymeric nanocomposites. Huang et al. [30] proposed a predictive model assisted by ML techniques, including artificial neural network (ANN) and support vector machine (SVM), to map the relationship between the mechanical properties of CNTs-reinforced cement composites and multiple influential factors. Le [31] developed a quick and robust computational tool based on the ML Gaussian process regression model to predict the tensile strength of CNT/polymer nanocomposites.

However, the aforementioned work mainly focused on utilizing ML-guided design approaches for predicting properties of composite materials on the macro-scale. Recently, several studies have been published using ML in multiscale modeling and simulation [20]. Xiao et al. [20,32] proposed an ML-enhanced hierarchical multiscale approach based on the dataset generated from both the MD simulations and the continuum model to study the mechanical behaviors of materials at the macroscale. Matouš et al. [33] noted in a review that ML methods that seek meaningful low-dimensional structures hidden in high-dimensional multiscale data (both computational and experimental) will be important for a variety of tasks. Meanwhile, Fish et al. [18] also claimed that data-driven and ML tools have great potential to accelerate materials discovery by combining with physics-based multiscale methods. To support this view, they cited a recent study that utilized Gaussian process metamodels informed from systematically coarse-grained MD simulations to discover optimal mechanical properties of polymer-grafted nanoparticle assemblies. Once trained and validated, such surrogate models can rapidly generate new data points by interpolating simulated outcomes, while sensitivity analyses easily reveal the parameters that matter the most [34].

Based on the above literature, this study developed an ML-enabled model that offers a multiscale approach for predicting the sensing behavior of composite sensing materials, as illustrated in Figure 1. Firstly, a multiscale electro-mechanical framework was proposed for modeling the electrical resistance change of the CNTs/PDMS composite under material deformation. This framework includes both the micro- and macro-scales for both mechanical and electrical domains involved. Within this framework, the DFT calculations were initially conducted to determine the Young’s modulus of an individual CNT. The result was then passed to the FEM simulation for predicting the macro-scale sensing behaviors on the meso-structures of CNTs/PDMS composites. Subsequently, an ML model, named AGAT, was established based on the architecture of ANNs embedded graph attention networks (GAT). The embedded ANN layer, utilized for predicting the Young’s modulus from the length and radius of the CNT, acted as a bridge between the DFT calculation and the FEM simulation, referred to as the DFT-FEM (DF) module. The dataset collected from multiscale simulations and the literature was utilized to train AGAT. The well-trained ML model was finally used to predict the material sensing property, with the meso-structure of the CNT/PDMS composite serving as the input parameters.

Figure 1

The diagram of the ML-enabled multiscale approach for predicting the sensing behavior of CNTs/PDMS composites.

METHODS

The micro-scale calculation

The purpose of the micro-scale calculation is to generate data samples that relate the Young’s modulus to the length and radius of a single CNT. The DFT calculations were performed using the VASP code. The Perdew-Burke-Ernzerhof functional within the generalized gradient approximation was used to treat the exchange-correlation, while the projector augmented-wave pseudopotential was applied with a kinetic energy cut-off of 500 eV, which was utilized to describe the expansion of the electronic eigenfunctions. The vacuum thickness was set to be 25 Å to minimize interlayer interactions. The Brillouin-zone integration was sampled by a Γ-centered 5 × 5 × 1 Monkhorst-Pack k-point mesh. All atomic positions were fully relaxed until the energy and force reached a tolerance of 1 × 10⁻⁵ eV and 0.03 eV/Å, respectively. The dispersion-corrected DFT-D method was employed to consider the long-range interaction.

The macro-scale simulation

In the macro-scale simulation, the FEM was employed to generate the dataset by simulating the mechanical-electric response of CNTs/PDMS composites at diverse meso-structures. The simulation started by constructing the geometric structure of the CNTs/PDMS composite based on four changeable parameters, namely the CNT radius, the CNT length, the volume ratio, and the CNT quantity. Establishing a comprehensive macro-scale geometry model for the composite is challenging, as it leads to a significant computational burden due to the complex micro-structure. Hence, to enhance computational efficiency, representative volume elements (RVEs) with dimensions of 5 μm × 5 μm × 5 μm were utilized to establish the geometry model for property evaluation. The RVE generation algorithm was developed in Python 3.6 with the ABAQUS software. The CNT filler was treated as a cylinder and the PDMS was regarded as the matrix in the RVE model. The proposed generation algorithm, which operates within the boundaries of the RVE matrix, was used to build the fillers based on the four input parameters listed above. Owing to the van der Waals forces, CNTs within the matrix do not physically overlap or interlace. Thus, an avoidance algorithm was devised to mimic this real-world structural characteristic. The flowchart of the RVE generation algorithm is depicted in Figure 2. Take the t-th CNT as an example. Firstly, the seed point of the CNT is created randomly within the matrix boundary, denoted as $A_{0}^{t}$ = (x₀, y₀, z₀); then the axis of the CNT extends from the starting point $A_{0}^{t}$ to the given length; each incremental growth step is recorded, resulting in the axis of the CNT being expressed as the set Path^t = { $A_{0}^{t}$, $A_{1}^{t}$, …, $A_{n}^{t}$ }, where n is in accordance with the length of the CNT. Whenever a new CNT is created, the avoidance algorithm is triggered. The overlap between two CNTs can be calculated by Eq. (1).

Figure 2

The flowchart of the RVE generation algorithm.

$\begin{array}{cr} \mathrm{Overlap} (\mathrm{Path}^{q}, \mathrm{Path}^{t}) = \min (| A_{i}^{t} - A_{j}^{q} |, \ A_{j}^{q} \in \mathrm{Path}^{q}, \ A_{i}^{t} \in \mathrm{Path}^{t}), \end{array}$ (1)

where Path^q refers to the axis set of the q-th CNT. Since Overlap is the minimum distance between the two axes, the t-th CNT and q-th CNT are considered overlapping when this value is smaller than twice the given radius. Upon iterating through all existing CNTs, any overlap with the newly created t-th CNT leads to its removal. The iterative creation of CNTs continues until the quantity reaches the set value and the RVE model is finally established.
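The generation and avoidance steps can be sketched in a few lines of Python. This is a minimal illustration rather than the ABAQUS script used in this work: it assumes straight CNT axes, a hypothetical growth step of 0.05 μm, and it treats two CNTs as intersecting when the minimum axis-point distance of Eq. (1) falls below twice the radius (the geometric condition for two cylinders of equal radius). The names `make_path`, `overlap`, and `generate_rve`, as well as all default values, are illustrative assumptions.

```python
import math
import random


def make_path(seed_point, direction, length, step):
    """Discretize a straight CNT axis into the points A_0 .. A_n."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    n = int(round(length / step))
    return [tuple(s + k * step * u for s, u in zip(seed_point, unit))
            for k in range(n + 1)]


def overlap(path_q, path_t):
    """Eq. (1): minimum distance between any two axis points of two CNTs."""
    return min(math.dist(a, b) for a in path_t for b in path_q)


def generate_rve(quantity, radius, length, box=5.0, step=0.05,
                 max_tries=10000, seed=0):
    """Place CNT axes one by one and discard any candidate that intersects."""
    rng = random.Random(seed)
    paths = []
    tries = 0
    while len(paths) < quantity and tries < max_tries:
        tries += 1
        start = tuple(rng.uniform(0.0, box) for _ in range(3))
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        path = make_path(start, direction, length, step)
        # Keep the whole axis inside the RVE boundary.
        if any(not (0.0 <= c <= box) for point in path for c in point):
            continue
        # Avoidance check against every CNT that already exists.
        if any(overlap(q, path) < 2.0 * radius for q in paths):
            continue
        paths.append(path)
    return paths


# Hypothetical inputs in micrometres: 5 CNTs, radius 0.05, length 1.0.
cnt_paths = generate_rve(quantity=5, radius=0.05, length=1.0)
```

The brute-force pairwise check costs O(n²) distance evaluations per CNT pair; for large CNT quantities a spatial grid or k-d tree would be a natural replacement.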

To explore the impact of the generated meso-structural configurations on the sensing property of CNTs/PDMS composites, a finite element analysis was undertaken. This analysis aimed to calculate the electrical resistance under different strain levels of 0%, 4%, 8%, 12%, 16%, and 20%, with a range of diverse RVE models employed. Taking advantage of COMSOL in multi-physics calculation, the pre-established RVE model was imported into the COMSOL Multiphysics software for the mechanical-electrical simulation. Notably, a tunneling effect is triggered when the distance between two individual CNTs is less than 1 nm, as depicted in Figure 3A. The simulation model was operated under the coupled structural mechanics and AC/DC module interface. Within the structural mechanics module, two matrix interfaces along the z-axis, namely the upper interface and the bottom interface, were chosen and assigned the fixed constraint condition and the prescribed displacement condition, respectively. The strain of the material was prescribed along the z-axis direction. As illustrated in Figure 3B, in the AC/DC module, the upper surface was designated as the voltage entry point, which was set at 1 V, while the bottom surface was selected as the voltage outlet, configured with the ground boundary condition. In terms of meshing, the substantial number of CNT fillers results in a sharp increase in the degrees of freedom of the RVE model’s grid, subsequently reducing the model’s computational efficiency. Hence, to enhance the computational efficiency, coarser grids were selected within the CNTs domain. The material parameters are shown in Table 1.

Figure 3

The FEM simulation of the CNTs/PDMS composite. (A) Tunnel effect; (B) boundary conditions; (C) distributions of electrical potential under different strains.

Table 1

Material parameters of the CNTs/PDMS composite

The resistance R of the CNTs/PDMS composite is calculated according to Eqs. (2) and (3).

$\begin{array}{cr} I = \iint_{S} \mathbf{J} \cdot \mathbf{n} \, d s = \iint_{S} σ \nabla φ \cdot \mathbf{n} \, d s = \sum_{i = 1}^{N_{s}} σ_{i} \frac{\partial φ}{\partial n} S_{i}, \end{array}$ (2)

$\begin{array}{cr} R = \frac{U}{I}, \end{array}$ (3)

where N_s is the number of mesh elements on the voltage-applied interface, which is the upper face in this work, n denotes the normal direction of that interface, J is the current density, S_i is the area of the i-th mesh element, U refers to the voltage value, and I denotes the current value. The electrical potential distributions of the CNTs/PDMS composite under different strains are shown in Figure 3C.
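As a small numerical illustration of Eqs. (2) and (3), the discrete sum over the mesh elements of the voltage-applied interface can be written as follows. The function name `resistance` and the per-element input lists are hypothetical; in practice these quantities would be exported from the FEM solver.

```python
def resistance(sigma, dphi_dn, area, voltage=1.0):
    """Eqs. (2)-(3): I = sum_i sigma_i * (dphi/dn)_i * S_i, then R = U / I."""
    if not (len(sigma) == len(dphi_dn) == len(area)):
        raise ValueError("per-element lists must have equal length")
    # Discrete surface integral of the normal current density.
    current = sum(s * g * a for s, g, a in zip(sigma, dphi_dn, area))
    if current == 0.0:
        raise ZeroDivisionError("no current flows through the interface")
    return voltage / current


# Two hypothetical mesh elements with identical properties (arbitrary units).
r = resistance(sigma=[2.0, 2.0], dphi_dn=[0.5, 0.5], area=[1.0, 1.0])
```

Here the current is 2.0 × 0.5 × 1.0 summed over two elements, i.e. 2.0, so the 1 V boundary condition gives a resistance of 0.5 in the chosen units.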

Machine learning

The architecture of AGAT is displayed in Figure 4. The overall architecture comprises the DF module and the GAT module. The embedded DF module is formed using an ANN to predict the Young’s modulus of an individual CNT and expand the 4-node inputs to 5-node inputs. Subsequently, the expanded input undergoes five layers of graph attention. In each attention layer, an 8-head attention mechanism is employed to update the information associated with each vertex by aggregating information from adjacent vertices with specific weights. The resulting outputs then pass through another graph attention layer with a single-head attention to obtain the final predictive values.

Figure 4

The architecture of AGAT. (A) The DF module constructed using ANN networks for predicting the Young’s modulus of an individual CNT; (B) integration of the value generated by the DF module with the original input; (C) configuration of the GAT with initial five layers of GAT convolutions, each featuring an 8-head attention mechanism, and a final layer with a single-head attention mechanism.

As shown in Figure 4A, the embedded DF module, representing the microscale CNT property, is built with the ANN. In AGAT, the CNTs/PDMS composite sample can be represented by an input H = (h₀, h₁, …, h_{n−1}) (as shown in Figure 4B). Two vertices of H, assumed here to be (h₂, h₃), were initially sent to the DF module for predicting the Young’s modulus of a single CNT, denoted by h_n. The DF module is composed of an input layer, five hidden layers, and an output layer, each of which is denoted by Zⁱ. The input layer consists of 2 nodes with the normalized values, the five hidden layers consist of 8, 32, 64, 64, and 32 nodes, respectively, and the output layer has 1 node with the predicted value. The process of transmitting information among nodes is presented in Eq. (4).

$\begin{array}{cr} Z^{i} = f^{i} (w^{i} Z^{i - 1} + b^{i}), \end{array}$ (4)

where Zⁱ denotes the node values of the i-th layer, and wⁱ and bⁱ represent the weight matrix and bias of that layer, respectively. fⁱ refers to the activation function, which is the ReLU function in this work. Since this module addresses a regression task, no activation function is employed on the output layer.
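The layer-by-layer propagation of Eq. (4) and the expansion of the 4-node input to a 5-node input can be sketched with NumPy as follows. This is an untrained illustration under stated assumptions: the weights are randomly initialized, the input values are made up, and the names `init_params` and `df_forward` are hypothetical. Only the layer sizes, the ReLU hidden activation, and the linear output layer follow the description above.

```python
import numpy as np

# Input layer, five hidden layers, and output layer of the DF module.
LAYER_SIZES = [2, 8, 32, 64, 64, 32, 1]


def init_params(sizes, seed=0):
    """Random (untrained) weights w^i of shape (out, in) and zero biases b^i."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(n, m)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def df_forward(params, x):
    """Z^i = f^i(w^i Z^{i-1} + b^i): ReLU on hidden layers, identity on output."""
    z = np.asarray(x, dtype=float)
    for i, (w, b) in enumerate(params):
        z = w @ z + b
        if i < len(params) - 1:
            z = np.maximum(z, 0.0)
    return z


params = init_params(LAYER_SIZES)
h = np.array([0.2, 0.5, 0.3, 0.7])   # hypothetical normalized 4-node input H
h_n = df_forward(params, h[2:4])     # the two vertices (h2, h3) enter the DF module
h_prime = np.concatenate([h, h_n])   # expanded 5-node input passed on to the GAT
```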

The output h_n was subsequently integrated with the original input H to yield the updated sequence $H^{'}$ = (h₀, h₁, … , h_n). The core framework employed in AGAT is the GAT, which leverages the attention mechanism to compute weights between each vertex and its neighbors during the message passing phase. As illustrated in Figure 4C, within this phase, the information associated with each vertex is aggregated with its adjacent vertices and connected edges in an attention strategy, consequently resulting in updated information. This iterative process continues until a definitive representation for each vertex is achieved. In the message passing approach, the update of the vertex $h_{i}^{t}$ is presented in Eq. (5).

$\begin{array}{cr} h_{i}^{t + 1} = U (h^{t}_{i}, \{ h_{j}, e_{i j} \}), \end{array}$ (5)

where U is the update function, $h_{i}^{t + 1}$ is the updated information of the vertex, h_j refers to the neighboring vertices and e_ij represents the edges connecting the neighbors to the vertex h_i.

In the framework of AGAT, the update function was derived based on the multi-head attention mechanism. Firstly, the importance of a neighboring vertex h_j was learnt and the attention score was calculated according to Eq. (6)

$\begin{array}{cr} \alpha_{i j} = {softmax}_{j} (a (w h_{j}, w h_{i})) = \frac{\exp (a (w h_{j}, w h_{i}))}{\sum_{j \in N_{i}} \exp (a (w h_{j}, w h_{i}))}, \end{array}$ (6)

where $\alpha_{i j}$ is the attention score, w is the trainable weight, a denotes the self-attention calculation, h_i represents the node information in the graph, h_j refers to a neighboring node information of h_i, and N_i represents the set of neighbors of node i. The softmax function used here is for normalizing the attention scores. The AGAT employs the multi-head attention mechanism, and thus the aggregated information of node i, denoted as ${h^{'}}_{i}$ , is derived in Eq. (7).

$\begin{array}{cr} {h^{'}}_{i} = \Vert_{k = 1}^{K} σ (\sum_{j \in N_{i}} \alpha_{i j}^{k} w^{k} h_{j}), \end{array}$ (7)

where $σ$ refers to the activation function, which is LeakyReLU in this work, ‖ denotes concatenation, and K denotes the number of independent attention heads, indexed by k. Five hidden layers of GAT convolutions were utilized in the AGAT, each consisting of an 8-head attention mechanism. After all vertices have been aggregated, the updated sequence information, denoted as $H^{''} = ({h^{'}}_{0}, {h^{'}}_{1}, ..., {h^{'}}_{n})$, was sent into the final layer of AGAT. In this layer, a single-head attention mechanism was employed to calculate predictive values.
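A single multi-head attention layer in the spirit of Eqs. (6) and (7) can be sketched with NumPy as shown below. Several details are assumptions rather than statements about the AGAT implementation: the scoring function a is taken to be the single-layer form with LeakyReLU from the original GAT formulation, the five vertices are fully connected with self-loops, each vertex carries one scalar feature, each head has 16 output features, and all weights are random. The names `leaky_relu` and `gat_layer` are hypothetical.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def gat_layer(h, adj, weights, attn_vecs, concat=True):
    """One multi-head graph attention layer.

    h: (N, F) vertex features; adj: (N, N) boolean, adj[i, j] is True if j is
    a neighbor of i (every row needs at least one True entry);
    weights: K arrays of shape (F, F_out); attn_vecs: K arrays of shape (2 * F_out,).
    """
    outputs = []
    for w, a in zip(weights, attn_vecs):
        wh = h @ w                                   # transformed features, (N, F_out)
        f_out = wh.shape[1]
        src = wh @ a[:f_out]                         # score part from vertex i
        dst = wh @ a[f_out:]                         # score part from neighbor j
        scores = leaky_relu(src[:, None] + dst[None, :])
        scores = np.where(adj, scores, -np.inf)      # mask non-neighbors
        scores = scores - scores.max(axis=1, keepdims=True)
        alpha = np.exp(scores)
        alpha = alpha / alpha.sum(axis=1, keepdims=True)   # softmax over neighbors, Eq. (6)
        outputs.append(alpha @ wh)                   # weighted neighbor aggregation
    if concat:
        # Hidden layer: activate each head and concatenate, Eq. (7).
        return leaky_relu(np.concatenate(outputs, axis=1))
    # Final layer: no activation; with a single head this is just that head.
    return np.mean(outputs, axis=0)


rng = np.random.default_rng(0)
N, F, F_OUT, K = 5, 1, 16, 8
h = rng.random((N, F))                               # hypothetical 5-vertex input H'
adj = np.ones((N, N), dtype=bool)                    # assumed fully connected graph
weights = [rng.normal(size=(F, F_OUT)) for _ in range(K)]
attn = [rng.normal(size=2 * F_OUT) for _ in range(K)]
h_new = gat_layer(h, adj, weights, attn)             # shape (5, 128)
```

Stacking five such layers with `concat=True` and one single-head layer with `concat=False` mirrors the layer counts described above; how the final vertex representations are read out into the six resistance values is not specified here and is left open in this sketch.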

RESULTS

Data collection, metrics, and cross-validation

To verify the effectiveness of the proposed model, in this section, the AGAT model was trained on the collected dataset. As shown in Figure 5A, the ML model provides an inexpensive mapping between the input parameters, denoted as X_p = [x_p¹, x_p², x_p³, x_p⁴] and representing the structure (specifically the CNT radius, the CNT length, the volume ratio, and the CNT quantity) of the p-th sample, and the corresponding resistance response as a 6-dimensional vector R_p = [r_p¹, r_p², r_p³, r_p⁴, r_p⁵, r_p⁶], consisting of electrical resistance values at 0%, 4%, 8%, 12%, 16%, and 20% strains, respectively. In the data collection process, 63 samples from the open Refs. [14,15,17,35,36] and generated DFT simulations (see Table S1) were gathered to train the DF module in the AGAT model, which serves for predicting the Young’s modulus of the single CNT from its length and radius. Afterwards, the Young’s modulus was passed to the COMSOL software before the FEM simulation. To this end, 230 numerically generated resistance responses were obtained as the outputs to train the remaining parameters of the AGAT model (see Table S2).

Figure 5

(A) The structure parameter vector X_p = [x_p¹, x_p², x_p³, x_p⁴] is fed to the AGAT model to predict the corresponding electrical resistance vector R_p = [r_p¹, r_p², r_p³, r_p⁴, r_p⁵, r_p⁶]; (B) the predictive performance of the AGAT model.

To assess the accuracy of the proposed AGAT model, various metrics including the coefficient of determination (R²), the mean absolute error (MAE), and the mean squared error (MSE) were adopted to evaluate the performance of the trained model, calculated according to Eqs. (8)-(10).

$\begin{array}{cr} R^{2} = 1 - \sum_{i = 1}^{N} \sum_{k = 1}^{6} {(r_{i}^{k} - {\hat{r}}_{i}^{k})}^{2} / \sum_{i = 1}^{N} \sum_{k = 1}^{6} {(r_{i}^{k} - {\bar{r}}_{i}^{k})}^{2}, \end{array}$ (8)

$\begin{array}{cr} MSE = \frac{1}{6 N} \sum_{i = 1}^{N} \sum_{k = 1}^{6} {(r_{i}^{k} - {\hat{r}}_{i}^{k})}^{2}, \end{array}$ (9)

$\begin{array}{cr} MAE = \frac{1}{6 N} \sum_{i = 1}^{N} \sum_{k = 1}^{6} | r_{i}^{k} - {\hat{r}}_{i}^{k} |, \end{array}$ (10)

where N represents the number of samples, r_i indicates the target output, ${\hat{r}}_{i}$ denotes the predicted output, and $\bar{r}$ refers to the average value of the target outputs.
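For reference, the three metrics can be computed over N samples with six outputs each as in the sketch below. It pools all 6N entries and uses their global mean as the average target value; whether the mean in Eq. (8) is taken globally or per output is an assumption of this sketch, and the function name `metrics` is hypothetical.

```python
def metrics(targets, preds):
    """Return (R2, MSE, MAE) for N samples x 6 outputs, following Eqs. (8)-(10)."""
    flat_t = [v for row in targets for v in row]
    flat_p = [v for row in preds for v in row]
    n = len(flat_t)                                   # equals 6 * N
    mean_t = sum(flat_t) / n                          # assumed: global target mean
    ss_res = sum((t - p) ** 2 for t, p in zip(flat_t, flat_p))
    ss_tot = sum((t - mean_t) ** 2 for t in flat_t)
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(flat_t, flat_p)) / n
    return 1.0 - ss_res / ss_tot, mse, mae


# Hypothetical normalized targets and predictions for two samples.
targets = [[0.10, 0.20, 0.30, 0.40, 0.50, 0.60],
           [0.20, 0.30, 0.40, 0.50, 0.60, 0.70]]
preds = [[0.12, 0.18, 0.33, 0.41, 0.47, 0.62],
         [0.19, 0.31, 0.38, 0.52, 0.61, 0.68]]
r2, mse, mae = metrics(targets, preds)
```

A negative R², such as the one reported for the single-layer model in the ablation study, arises whenever the residual sum of squares exceeds the total sum of squares, i.e. when the model performs worse than predicting the mean.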

To achieve robust outcomes, the above criteria were derived through 5-fold cross-validation. The dataset was split into five distinct subsets; in each iteration, four of them were used for training while the remaining subset served as the testing data. The values obtained in the five iterations were then averaged and regarded as the final accuracy of the model.
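The splitting and averaging procedure can be sketched as follows. The shuffling seed, the helper names `k_fold_indices` and `cross_validate`, and the `fit_and_score` callback (which would train a model on the training indices and return one metric on the test indices) are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]            # k disjoint subsets
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def cross_validate(n_samples, fit_and_score, k=5):
    """Average the score returned by fit_and_score over the k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# With the 230 FEM samples, each fold holds 46 test and 184 training samples.
splits = list(k_fold_indices(230))
```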

Hyper-parameter tuning

The hyper-parameters play a crucial role in determining the accuracy, robustness, and overall capabilities of the ML model. Hence, in the pursuit of optimizing the performance of the proposed model, extensive experiments were conducted to examine the impact of the hyper-parameters on the model’s efficacy (see Tables S3‒S5). The mean MAE, MSE, and R² values are presented in Tables 2 and 3, where these values are intricately tied to the varying quantities of hidden features embedded within each layer, the count of attention mechanism heads, and the training batch size employed, respectively. It should be noted that the MAE and MSE values were calculated on the normalized data.

Table 2

Influence of the hidden feature number and heads of attention mechanism on the model performance

Table 3

Influence of the training batch size on the model performance

As shown in Table 2, the R² value exhibits a declining trend with the increase in the count of hidden features. At the same time, the MSE value reaches its lowest value of 0.0060 when the model has precisely 16 hidden features. This observation can be attributed to the constrained set of six input attributes and a sample size of 230 instances. When dealing with a smaller dataset, networks with fewer hidden features tend to perform better on the testing data. With this optimal hidden feature number, the focus then shifts to the number of attention mechanism heads used. The findings from Table 2 reveal that the MSE value is minimized when employing an 8-head attention mechanism, indicating its optimal performance. Correspondingly, with the same configuration, a peak R² value of 0.7852 is observed.

Table 3 illustrates that the MAE, MSE, and R² values exhibit minimal fluctuations once the training batch size reaches or exceeds 16. Notably, these metrics demonstrate superior performance when the batch size is set to 16. Specifically, the MSE value remains constant at 0.0060 for batch sizes of 16, 32, 64, and 128. Additionally, the R² value shows little variance, with the largest value of 0.7883 occurring at a batch size of 16.

Model performance

A two-stage training method was utilized in this study, wherein the DF module was initially trained independently based on the collected DFT data for predicting the Young’s modulus of the single CNT. Subsequently, the remaining parameters of the AGAT model were trained on 230 numerically generated data points. Consequently, the predictive accuracies of the DF module and the overall AGAT were both evaluated. The DF module exhibited an R² value of 0.93, indicating its ability to accurately predict the Young’s modulus. The predictive accuracy of the overall AGAT model concerning the electrical resistance of the CNTs/PDMS composite under various strain levels (0%, 4%, 8%, 12%, 16%, and 20%) is presented in Figure 5B. The figure visually represents the model’s performance in predicting each sequential data point within the series R_p = [r_p¹, r_p², r_p³, r_p⁴, r_p⁵, r_p⁶]. Notably, the R² values corresponding to these six outputs all surpass 0.79, thereby implying an acceptable level of prediction accuracy. In addition, the R² value exhibits only small fluctuations among the six sequential data points. This consistent behavior reinforces the notion of the model’s efficacy in reliably predicting electrical resistances under varying strain conditions. The results demonstrate that the AGAT model can effectively predict the electrical resistances under different strains.

Ablation study

To gain deeper insights into the architecture of the AGAT model, we conducted an ablation study encompassing two distinct experiments. The initial experiment aimed to assess the effect of the reduction in network layers. Table 4 presents a quantitative comparison across varying quantities of hidden layers (from the 2nd row to the 6th row) and reveals the significant impact of the network layers on the model performance. Notably, when network layers are removed, a marked decrement in performance is observed. This decrement is especially pronounced in the model comprising only a single layer, which is considered ineffective due to its negative R² value. Conversely, as the count of network layers increases, a conspicuous increase in R² values is observed, accompanied by simultaneous reductions in the MAE and MSE metrics. However, the improvement from four layers to five layers is only slight. The observations derived from this experiment suggest that the number of network layers is a critical factor contributing to the AGAT model’s performance. An insufficient number of layers yields unsatisfactory performance, while the integration of additional layers enhances the model’s predictive capabilities. However, it is worth noting that further improvement tapers off after surpassing a certain layer threshold.

Table 4

Ablation study on the hidden layer variations and the DF module in the AGAT model

To examine the significance and enhancements offered by the DF module designed to bridge micro-scale calculations to macro-scale simulations, the second ablation study was conducted. This study involved the removal of the DF module from the model architecture. The detailed performance of the model without the DF module is displayed in Table 4 (the 1st row). Compared with the metrics in the 6th row, it is evident that the exclusion of the DF module leads to increases in both the MAE and MSE values, rising from 0.0543 to 0.0584 and from 0.0060 to 0.0067, respectively. Simultaneously, the R² value experiences a reduction from 0.7883 to 0.7548. These outcomes underscore that the AGAT multiscale approach outperforms the original GAT model, which is trained solely on the single FEM data source. This highlights the added value of the DF module in enhancing the model’s predictive capabilities across multiple scales.

DISCUSSION

This study presents a novel approach, designated the AGAT model, for predicting the electrical response of the CNTs/PDMS composite using a GAT-based ML model. For a specified meso-structure, the trained AGAT model can rapidly and directly predict the sensing behavior of the CNTs/PDMS composite without running the physics-based multi-scale simulations. The multi-scale framework first uses DFT calculations and FEM simulations to generate the datasets. Based on the collected datasets, the AGAT model is trained with the meso-structural characteristics of the CNTs/PDMS composites as inputs, namely the CNT radius, the CNT length, the volume ratio, and the CNT quantity. The electrical resistance sequence recorded at strain levels of 0%, 4%, 8%, 12%, 16%, and 20% constitutes the output of the model. To assess the model’s efficacy, three metrics are employed: R², MAE, and MSE. In 5-fold cross validation, the R² values of the six output targets all exceed 0.79, indicating good agreement between the predicted values and the ground truth. Furthermore, the ablation study indicates that the optimal network architecture consists of five hidden layers, with the DF module serving to connect the multi-scale simulations.
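The 5-fold cross validation mentioned above can be sketched as follows. This is a generic illustration of the procedure, not the authors' training code. The function names, the shuffling seed, and the placeholder scorer are assumptions, and a real scorer would train the model on the training indices and return a metric such as R² on the validation indices.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, val) index lists; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def cross_validate(n_samples, fit_and_score, k=5, seed=0):
    """Average a user-supplied score over the k folds."""
    scores = [fit_and_score(train, val)
              for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores), scores


# Placeholder scorer (hypothetical): returns the training fraction of each split.
def dummy_score(train, val):
    return len(train) / (len(train) + len(val))


mean_score, fold_scores = cross_validate(100, dummy_score)
print(mean_score, fold_scores)
```

For a model with six outputs, one per strain level, the same loop would be run with a scorer that returns one value per target, so that each target's metric is averaged across the five folds.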

CNTs/PDMS composites show great potential in electronic devices owing to the interlaced networking structure of the CNTs, which changes in response to external forces. The proposed AGAT model enables efficient exploration of the optimal structure governing the sensing property. While this study has focused on the electrical resistance-strain response of CNTs/PDMS composites, the satisfactory predictions made by AGAT also make it promising for application to other kinds of materials, especially nanocomposites. Nanocomposites consist of multiple phases, at least one of which has one, two, or three dimensions in the nanometer range. The reinforcing material is usually the dispersed phase, while the matrix material is the continuous phase. Hence, both the material properties of each phase and their structural assembly significantly influence the final property. In this case, multiscale simulations play a crucial role by using micro-scale calculations to determine nano-scale material properties and macro-scale simulations to optimize the material structure. The proposed ML-enabled multi-scale strategy provides a platform for accelerating multiscale simulations from the micro-scale domain to the macro-scale domain. In this multi-scale framework, embedding a well-designed module that bridges the micro-scale and the larger scale allows the ML model to be trained through the hierarchical multiscale simulation, thus endowing it with the ability to predict material properties efficiently.

However, a challenge resides in the scale disparity that characterizes multiscale simulations. For example, in the investigation of CNTs/PDMS composites, the current literature on CNT property calculations using DFT reports nanotube lengths on the order of several tens of nanometers. In contrast, in the literature on finite element analyses at the meso- or macro-scale, CNTs within CNTs/PDMS composites can extend to lengths of several thousand nanometers. To overcome this limitation, the present study conducted various DFT simulations on CNTs with lengths surpassing 1000 nm. However, such extended CNT structures entail a considerable computational burden due to their sizable unit cells (over 10,000 atoms for a single CNT with a radius of 15 nm and a length of 1000 nm), making it extremely expensive to accumulate a sufficient number of data samples. Another challenge lies in the potential errors introduced by simulation-based calculations. In this study, the ML model was established mainly using simulation data, which could lead to non-negligible mismatches when compared with experimental measurements. However, acquiring sufficient experimental data is a major challenge that can hardly be met in practice. Hence, efforts should be made to enhance the accuracy of the ML model by leveraging the large, comparatively “inexpensive” simulation datasets alongside the limited, expensive experimental datasets, using transfer learning, pre-trained models, or other relevant techniques. The third challenge emerges in the development of an optimal sampling algorithm, which demands an efficient strategy to ensure that the most informative data samples are included. Collecting training datasets requires combining simulations from the micro-scale to the macro-scale, where the hierarchical approach is often preferred over simultaneous execution, leading to a substantial time cost. Consequently, strategies for data sampling need further exploration to determine the minimum number of multiscale simulations required without compromising the performance of the ML model.

Funding

This work was supported by the National Key R&D Program of China (2022ZD0117501), and the National Natural Science Foundation of China (62201441).

Author contributions

L.Y. and X.W. designed the research and analyzed the data. X.W. supervised the project. C.Z. and J.C. conducted the FEM simulation. Z.S. and H.G. collected the FEM data. H.D. collected the micro-scale calculation data. M.Z. performed the hyper-parameter tuning experiments. L.Y. and X.W. co-wrote the manuscript. All authors contributed to discussions.

Conflict of interest

The authors declare no conflict of interest.

Supplementary information

The supporting information is available online at https://doi.org/10.1360/nso/20230055. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.



