Non-destructive Method of the Assessment of Stone Masonry by Artificial Neural Networks

Background: In this study, a methodology based on non-destructive tests was used to characterize historical masonry and subsequently to obtain information on the mechanical parameters of these elements. Because of the historical and cultural value of these buildings, maintenance and rehabilitation work is important to preserve the appreciation of history. The preservation of buildings classified as historical-cultural heritage is of social interest, since they are important to the history of society. Because the research object is a historical building, destructive investigative techniques are not recommended.


INTRODUCTION
The rehabilitation of historic buildings is considered an important step in the preservation of historical heritage, with a global or national dimension. For a well-grounded, lasting rehabilitation process with the least possible damage, non-destructive tests (NDTs) are required. Among these, GPR (Ground Penetrating Radar) stands out for its efficient ability to generate images of the subsurface and its easy applicability to different situations. GPR has been used successfully in investigations of historic buildings [2,3]. The sonic tests are performed with an instrumented hammer and accelerometers that receive the waves, which can be P, R or S. The sonic tests are also widely used in the field of rehabilitation [4,5]. Finally, there are dynamic tests, which provide the frequencies and vibration modes of the analyzed structure. Like the other methods presented, this one is also efficient in the characterization of structures [6,7].
These techniques have been applied to masonry panels built in a controlled laboratory environment, with characteristics similar to those of existing historical buildings. In parallel with the application of the NDTs, conventional uniaxial compression tests were also carried out on these double-leaf granite stone masonry (DSM) panels. These data were correlated with the results of the NDTs with the aid of artificial neural networks (ANNs); the elastic modulus was thus obtained from the NDTs.
The application of non-destructive tests to DSM, despite being an advanced technology, can be frustrating because the results are often difficult to interpret, as masonry is a highly heterogeneous composite. The mechanical behavior of masonry has been difficult to understand fully, due to the complexity of its nonlinear structural variables [8-12]. The combination of geophysical techniques in the investigation of heritage buildings provides greater confidence and complementary results.
The ambiguity inherent in the non-uniqueness of the response of geophysical methods applied to masonry has led to the combined use of more than one of these methods to reduce the intrinsic uncertainty of the resulting models. Applying several NDTs in parallel to the same structure, as a support for heritage preservation, brings many benefits to the analysis of the materials and the construction elements. By analyzing the variation of specific parameters in each method, it is possible to identify particular characteristics of the analyzed materials.
Artificial Neural Networks (ANNs) are used in this work because they are an efficient tool for correlating the results previously presented. ANNs are computational tools composed of interconnected processing units (artificial neurons) capable of solving complex problems in several areas of knowledge. They are based on the behavior of the human neurological system and can therefore develop the ability to learn and store information, as well as to recognize and classify patterns.
The networks have been used successfully in several works in the field of engineering [13-17]. ANNs have also been used successfully in mechanical characterization, for example to model the compressive strength and failure criterion of anisotropic materials such as masonry [18], and to predict the main cutting force component and the mean surface roughness during the turning of tool steel [19], the compressive strength of mortars [20,21], the shear capacity of concrete beams [22] and the compressive strength of self-compacting concrete [23].
For each artificial neuron, several input data are defined, which may be original data or responses from other neurons in the network. Each input is received through a connection that has a weight. Neurons also have their own excitation threshold, which is the minimum intensity required for neuron activation. For a given response to leave one neuron and enter another, there must be an excitation threshold for the output and an activation function for the input. In this way, the synapse depends on the sign of the weight and has an inhibitory (negative) or excitatory (positive) effect [24].
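The behavior of a single artificial neuron described above can be illustrated with a minimal sketch. It assumes that the excitation threshold is subtracted from the weighted sum of the inputs and that the activation function is the hyperbolic tangent; the function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def neuron_output(inputs, weights, threshold):
    """Weighted sum of the inputs minus the threshold, passed through tanh."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return math.tanh(net)


# Hypothetical values: positive weights are excitatory, negative ones inhibitory.
excited = neuron_output([1.0, 0.5], [0.8, 0.4], threshold=0.2)
inhibited = neuron_output([1.0, 0.5], [-0.8, -0.4], threshold=0.2)
print(excited, inhibited)
```

With the same inputs, changing only the sign of the weights turns a positive (excitatory) response into a negative (inhibitory) one.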
In general, an ANN trained with a data set that is representative of the problem domain proves successful in solving new problems with reasonable accuracy. Although ANNs have been used successfully in numerous engineering applications, few studies have used them to address the mechanical behavior of masonry [24-30].
The nntool (Neural Network Toolbox™) GUI (Graphical User Interface) neural network interface instrument, available in MatLab (Matrix Laboratory, 2013), was used to perform the data analysis since the same software contains a computational package for the use of ANN in its most diverse forms of processing. This package supports different types of networks, so it can be used for several areas of science and various types of problems.
For the construction of this ANN, the multi-layer feedforward architecture was used as well as supervised learning with a backpropagation algorithm. The multilayer ANN was configured, trained and simulated using MatLab (Matrix Laboratory, 2013), through a code developed to automate this process.
The number of neurons in the input and output layers is defined by the problem. The remaining settings are chosen empirically, but some selection criteria already presented in the literature are followed, namely: adopting a number of input neurons equal to the number of problem variables (which are set by the user); and starting with one hidden layer whose number of neurons equals the average of the numbers of input and output neurons.
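As a small worked example of the starting rule above, the sketch below computes an initial hidden-layer size as the mean of the input and output neuron counts. Rounding up to the next integer is an assumption made here, and the function name is hypothetical.

```python
import math


def initial_hidden_neurons(n_inputs, n_outputs):
    """Starting hidden-layer size: mean of input and output counts, rounded up."""
    if n_inputs < 1 or n_outputs < 1:
        raise ValueError("layer sizes must be positive")
    return math.ceil((n_inputs + n_outputs) / 2)


# Six input variables and one output neuron, as in the networks of this work.
start = initial_hidden_neurons(6, 1)
print(start)
```

This value is only a starting point; the hidden-layer sizes actually tested in this work are the ones listed in the text.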
After the set of samples to be analyzed is defined, 70% of the samples are randomly selected by the MatLab (Matrix Laboratory, 2013) nntool as training data for the neural network. A further 15% of the samples are designated as validation data, which measure the generalization of the network by providing it with data that it has not seen before. The remaining 15% of the samples, called test or simulation data, provide an independent measure of the neural network performance in terms of error rate. Through an iterative and random process, the weights are modified until the iterations converge to acceptable errors. In the training stage, the lowest possible root mean square error must be sought; the test set error and the validation set error must have similar characteristics, and no significant overfitting should occur over the iterations. By following these checks, the best validation performance point is identified. With the weights established, the network is subjected to validation and generalization through the sim function, which performs the simulation with the ANN. After each training, the software reports the error graphs obtained for all samples, for each division. After analyzing these graphs, the user decides whether to request new training of the ANNs.
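The random 70/15/15 division can be sketched as follows. This is not the routine used by the toolbox; it is a minimal illustration that assumes the validation and test counts are rounded to the nearest integer and that the training set receives the remainder. The function name and the seed are hypothetical.

```python
import random


def split_indices(n_samples, val=0.15, test=0.15, seed=0):
    """Randomly divide sample indices into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = round(n_samples * val)
    n_test = round(n_samples * test)
    n_train = n_samples - n_val - n_test  # training set takes the remainder
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(32)
print(len(train_idx), len(val_idx), len(test_idx))
```

Under these assumptions, a database of 32 samples yields 22 training, 5 validation and 5 test samples.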
The performance of the network must be analyzed according to the relationship between the outputs and the corresponding targets. To improve the results, the following approaches can be used: reinitializing the weights and retraining, increasing the number of hidden neurons, increasing the amount of input data, and using another algorithm. In this work, the artificial neural networks were used to correlate the data from the NDTs with the results of the mechanical tests. This section presents the variables used to build the input database of the ANNs, as well as the preparation of these data, obtained from the NDTs, for later use in ANN training.
When applying ANNs, the user's role is to provide the data that will be used in each step, i.e. sets of information (input and output) for training and validation, and the input for generalization. Inputs are organized into arrays with their respective outputs, and thus the network recognizes the existing mathematical relationships between the data. In the present work, the input data set consists of samples corresponding to the results of the non-destructive tests, organized in columns in an Excel sheet that is later imported into the MatLab (Matrix Laboratory, 2013) workspace.

METHODOLOGY
The investigations were carried out in the Laboratory of Seismic and Structural Engineering (LESE) of the Faculty of Engineering of the University of Porto (FEUP). Eight double-leaf stone masonry (DSM) walls were studied (Fig. 1), divided into four types (Table 1). The laboratory work was performed on real-scale masonry, using the parameters and techniques defined in the literature [31,32]. The eight DSM walls [32] are composed of a stone from the north of Portugal and a mortar consisting of hydrated lime, gravel, and water. The vertical and horizontal joints were approximately 3 cm thick. The masonry walls are made of roughly shaped granite blocks and a non-cohesive filler (small stone fragments bound with traditional lime mortar). The granite blocks were collected from old masonry buildings located in the north of Portugal, and the mortars are composed of lime and clay (1:3 ratio). The masonry specimens were built by professionals under controlled laboratory conditions, and were designed and constructed to be representative of the traditional typologies of masonry construction in the Mediterranean. The eight masonry walls are 0.90 m long, 0.55 m thick and 1.75 m high, leading to a slenderness ratio h/t of 3.18 and a volume of 0.87 m³ each.
Table 2 presents an overview of the tests performed and the possible response variables used as input data for the neural networks. For the GPR test (Fig. 2), the variable chosen as network input was the amplitude, represented by a value obtained after applying the RMS (root-mean-square). The dynamic tests are represented by the values of the natural frequencies. The sonic tests are represented only by the velocity values obtained with the indirect test configuration. The acronym VU denotes values that are unique for the entire wall and VC denotes values specific to each zone. How these values are obtained is detailed in the following subsection.
It is important to note that the selected input data do not necessarily have to be correlated with each other, for example, the velocity values of the sonic wave propagation do not correlate with the values of the electromagnetic wave amplitudes. The purpose of ANN is to correlate these input parameters with the mechanical test data, namely the elastic modulus, so there is no need to correlate the input data with each other, but rather the existence of a correlation between the input data with the output data.
Information defined as input data should be standardized, since large discrepancies between the values may degrade the performance of the neural networks. For correct training of the network, the database should cover as many scenarios of the structures as possible, so that the network can handle the cases that may occur [17]. Table 3 presents an overview of the responses obtained with the uniaxial compression test, which are used as ANN output data. The output variable is E2 (global tangent modulus of the panel, in GPa), resulting from the compression tests. This variable was defined from the slope of the compressive stress (σ) versus strain (ε) graphs obtained in the uniaxial compression tests (Fig. 3), for stresses in the range of 20 to 40% of the maximum stress (σmax) applied to each wall.
The architecture of the ANNs was defined based on six variables: geometric characteristics (presence of cross-blocks and face characteristic, regular or irregular), results of the dynamic characterization (vibration frequencies in the X and Y directions), wave velocity of the indirect sonic tests (Ac1, Ac2 or Ac3) and the RMS values of the amplitudes of the GPR test. Thus, we considered one input layer with a number of neurons matching the number of input variables, one hidden layer and one output layer with one neuron. The database consists of 32 input samples (8 walls with 3 zones each, plus the average of each wall), giving 192 values (32 samples × 6 variables) in the input base.
The data used must characterize the actual condition of the structure: because ANNs are strictly mathematical techniques, the performance of the processing depends on the correct supply of data. Normalization of the database (dynamic test, sonic test and GPR results) is also advisable, since the disparity between values can degrade the performance of the networks. This normalization was carried out by dividing all the values of each parameter by its maximum value. In Eq. (2), N_var,pe equals 6, the number of input variables: the two geometric characteristics (cross-block presence and irregular face presence), the natural frequencies (x) and (y) from the dynamic tests, the wave velocity from the indirect sonic tests (Ac3) and the amplitude from GPR. N_atr equals 32, because the training used 32 samples (eight DSM walls with three zones each, plus the average of each wall). The use of 5, 10, 15, 20, 25 and 30 neurons in the middle layer of the networks (N_nci) was tested.

(2)
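The normalization by the maximum described above can be sketched as follows. The table layout (one row per sample, one column per parameter), the function name and the numerical values are assumptions made for illustration only; the sketch also assumes that every parameter has a non-zero maximum.

```python
def normalize_by_max(rows):
    """Divide every column of a table by that column's maximum value."""
    n_cols = len(rows[0])
    col_max = [max(row[j] for row in rows) for j in range(n_cols)]
    if any(m == 0 for m in col_max):
        raise ValueError("a column with maximum 0 cannot be normalized")
    return [[row[j] / col_max[j] for j in range(n_cols)] for row in rows]


# Hypothetical values: column 0 could be a frequency, column 1 a wave velocity.
samples = [[10.0, 1500.0], [20.0, 750.0]]
normalized = normalize_by_max(samples)
print(normalized)
```

After this step every parameter has a maximum of 1, so variables measured in very different units contribute on a comparable scale.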
The development and mathematical details of the implementation of ANNs can be found in other reference works [33,34] and are not discussed here. The parameters used for the training of the ANNs are summarized in Table 4. After defining the ANN database, the next step is to train the networks to obtain the most efficient tool. The efficiency of a trained network is defined in terms of the lowest error (%) calculated for each sample (Eq. 3).

$$\text{Error}\,(\%) = \frac{\text{Target} - \text{Output}}{\text{Target}} \times 100 \qquad (3)$$
where Target is the value of E2 and Output is the value provided by the training.
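The per-sample error can be computed as in the sketch below. It assumes that Eq. 3 is the usual signed relative error, (Target − Output) / Target × 100; the function names and the numerical values are hypothetical.

```python
def relative_error_percent(target, output):
    """Signed relative error of one sample, in percent."""
    if target == 0:
        raise ValueError("target must be non-zero")
    return (target - output) / target * 100.0


def max_relative_error(targets, outputs):
    """Error of largest magnitude over all samples, keeping its sign."""
    errors = [relative_error_percent(t, o) for t, o in zip(targets, outputs)]
    return max(errors, key=abs)


# Hypothetical E2 targets and network outputs, in GPa.
targets = [2.0, 4.0]
outputs = [2.2, 3.0]
worst = max_relative_error(targets, outputs)
print(worst)
```

A negative value means the network overestimates the target, which is consistent with errors being reported with a sign.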
Network training consists of processing this data set (input and output) so that the tool can establish a mathematical relationship between them. Araújo (2017) suggests performing at least three runs of each test; in this way, the possibility of overfitting is reduced, and the convergence of the results is verified. The parameters that minimize the maximum relative error (Eq. 3) of the training data can be defined as follows: (i) type of data analyzed: the more information the input patterns contain, the better the algorithm establishes valid relations between input and output patterns; (ii) network parameters (number of neurons): the number of neurons plays a decisive role in network processing, since a higher number generally improves performance, although an excessive increase may cause the results to diverge.

Data Preparation for ANNs
In this section, we present the methodology used to organize the database of the six input variables of the ANNs, as well as the adaptation of the NDT responses for inclusion in this database. It is worth mentioning that the responses of the dynamic characterization test required no adaptation: their numerical values, which correspond to the vibration frequencies of the structure (1st and 2nd frequency), were included directly in the input database of the ANNs.
The input data were grouped in matrix form. For the training and validation stages of networks with supervised learning, in the first phase, the input data are supplied together with their respective outputs, and in this way the network identifies the mathematical relationships between the information. In the second phase, the network parameters for training are defined, which include the type of algorithm, the number of neurons, the activation functions, the learning rate and the number of iterations. The algorithm estimates the errors of this mapping and, if they are acceptable, proceeds to the validation phase, which is intended to evaluate the behavior of the trained network using a set of data not previously seen by the network (Araújo 2017).
It is important to emphasize that ANN is a tool developed to work with many samples in its database. This database is divided randomly as described above. In the last two phases (validation and testing), 15% of the data in the present study must correspond to a minimum of three samples, enough for the composition of an error chart. Therefore, a minimum set of 20 samples is required on this basis.
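The minimum of 20 samples follows directly from the 15% share. As a worked check, with N denoting the total number of samples (a symbol introduced here only for illustration):

```latex
0.15\,N \ge 3 \;\Longrightarrow\; N \ge \frac{3}{0.15} = 20
```

For a database of 32 samples, 0.15 × 32 = 4.8, so the validation and test subsets each hold about five samples, above the minimum of three.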
The number of samples (eight DSM walls) must be greater than the number of input variables; otherwise, the system will have more unknowns than equations for its resolution. A set of only eight DSM walls is too small to map the function implicit in this problem. To improve the network result, there are two options: to reduce the size of the problem by decreasing the number of variables, or to increase the number of samples.
Another issue is the difficulty of obtaining an amount of experimental data large enough to train the ANN adequately. Producing many samples is problematic and costly, both for sample production and for the actual measurements; in addition, the samples must be stored during this period, which requires dedicated space and the associated cost. The option chosen was to increase the number of samples, at least tripling it. According to the configuration of the NDTs, three vertical zones were defined for the characterization of each part of the DSM (Fig. 4). The input data for the geometric characteristics (cross-block and irregular face) are represented by numbers. A DSM zone with no cross-block is represented as 0, with one cross-block as 1 and with two cross-blocks as 2; the presence of an irregular face is represented as 0 and its absence as 1.
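The numerical coding of the geometric characteristics described above can be sketched as follows. The function name and argument names are hypothetical; the codes themselves follow the text (0, 1 or 2 cross-blocks; irregular face present as 0, absent as 1).

```python
def encode_geometry(n_cross_blocks, has_irregular_face):
    """Return the two geometric input values for one wall zone."""
    if n_cross_blocks not in (0, 1, 2):
        raise ValueError("a zone is coded with 0, 1 or 2 cross-blocks")
    # Presence of an irregular face is coded as 0, its absence as 1.
    face_code = 0 if has_irregular_face else 1
    return [n_cross_blocks, face_code]


# Hypothetical zones, for illustration only.
zone_a = encode_geometry(2, has_irregular_face=False)
zone_b = encode_geometry(0, has_irregular_face=True)
print(zone_a, zone_b)
```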
The output data refer to the results of the uniaxial compression tests. According to the instrumentation used for the uniaxial compression test, there are LVDTs at both ends of each face. The first zone corresponds to the elastic modulus obtained from the graph of the mean of the LVDTs positioned in the left lateral zone (faces A and B); the second zone corresponds to the elastic modulus obtained from the graph of the average of the four readings of the global LVDTs, since there was no instrumentation in this region during the test; and the third zone corresponds to the mean of the LVDTs positioned in the right lateral zone (faces A and B). The input variable for the dynamic test results is repeated three times, once for each vertical zone.

Sonic Tests Parameters
The results of the sonic tests used as input data for the network are those of the indirect configuration. The tests with the direct configuration were not considered because they characterize the walls only locally, and in the direction perpendicular to the application of the loads, so a good correlation with the modulus of deformability is not expected. For this type of masonry, the results of the direct sonic test depend to some extent on the compressive stress state in the masonry, but they can be used to determine other characteristics, such as the location of voids, joints and deterioration [35]. In this work, the results of the direct sonic tests were nevertheless important to frame and validate the results obtained by the indirect tests.
For the accelerometer Ac3 (the farthest), the wave path between the emitter and the receiver is 90 cm long, thus involving an interaction with a major part of the wall structure. For this reason, its result was considered more characteristic. The accelerometer Ac2 is intermediate and Ac1 has the shortest path because it is closest to the emitter, so the wave crosses a smaller portion of masonry; the corresponding results are therefore used here only in a second stage (Fig. 5).

GPR Tests Parameters
The GPR test provides radargrams as a response, from which data related to the propagation velocity of the electromagnetic wave and the amplitude variations in the medium can be extracted. The amplitude varies from point to point within the same radargram, so this characteristic was chosen to numerically represent the results of this test for the ANNs. Based on the three zones defined for the walls (Fig. 4), the numerical data on the distribution of the amplitudes in the radargrams, referring to the XZ plane, were exported according to Table 5.

Fig. (5). Positions for obtaining indirect sonic tests.
The data of the chosen radargrams are exported as a numerical matrix, i.e., the amplitude values vary according to the length and depth of each wall. These data are processed in MatLab (Matrix Laboratory, 2013) with the RMS (root-mean-square) function. The processing sequence was as follows: first, an RMS is calculated for each column, and then a single RMS is obtained from the values of all columns. This single value, obtained at the end of the routine, is used to represent each zone of each wall as an input variable in the network.
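The two-step RMS reduction can be sketched as below. The sketch assumes that the amplitude matrix is stored as a list of rows with one column per position along the wall; the function names and the amplitude values are hypothetical.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("rms of an empty sequence is undefined")
    return math.sqrt(sum(v * v for v in values) / len(values))


def radargram_rms(amplitudes):
    """RMS of each column, then RMS of the resulting column values."""
    columns = zip(*amplitudes)  # transpose: iterate column by column
    return rms(rms(column) for column in columns)


# Hypothetical 2 x 2 amplitude matrix; its columns have RMS 3 and 4.
zone_value = radargram_rms([[3.0, -4.0], [3.0, 4.0]])
print(zone_value)
```

When all columns have the same length, this two-step value equals the RMS of all the matrix entries taken together.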
To use the data resulting from the GPR tests as input data in the neural networks, it was necessary to define a methodology for converting radargrams into simple numerical data. In this way, the converted data represent the variations between the samples that are needed for the ANN characterization. A characteristic behavior is thus identified for each situation, so that a specific pattern can be verified. Table 6 presents the RMS results for the amplitude data of each radargram, distributed according to typologies. The RMS values are related to the amplitude values of the radargrams, i.e., higher amplitudes correspond to higher RMS values. The amplitudes of the radargrams are influenced by the attenuations and reflections related to the medium crossed in the subsurface; thus, smaller amplitudes indicate greater attenuations and smaller reflections. From the analysis of Table 6, we identified a variation pattern for the 2nd zone. For all the walls except PP1, the values of this zone were smaller than those of the 1st and 3rd zones. As described, a lower RMS value indicates higher attenuations and smaller reflections. In this work, the smaller amplitude values can be attributed to smaller variations of the medium (fewer reflections), thus suggesting greater structural regularity. This is justified for walls PP5, PP2 and PP6 because in this second zone there is an indication of the presence of cross-blocks, which give the walls greater structural stability.
The walls that have two leaves with a regular arrangement of the stones (PP1, PP2, PP5 and PP6) also present the lowest RMS values, except for PP1. This may be related to the fact that they have greater structural stability than the others, which have irregular leaves (PP3, PP4, PP7 and PP8), and it also indicates greater regularity of the data.

Analysis Using ANN
Table 7 shows the database used in the ANN analysis, without normalization. The six input parameters (geometric characteristics: cross-block presence and irregular face presence; natural frequencies x and y from the dynamic tests; wave velocities from the indirect sonic tests, Ac3; and the amplitude from GPR) and the output parameter (E2) are listed as previously described.
Varying combinations of DSM characteristics define the input patterns. Nine random combinations (R1, R2, R3, R4, R5, R6, R7, R8 and R9) and six combinations according to the characteristics (RC, RCG, RCD, RES, RGPR and RCGF) were considered. RC is the complete network; RCG is composed of the geometric characteristics (the presence and number of cross-blocks in the masonry and the distinction between regular and irregular faces); RCD is composed of the dynamic characteristics; RCGF is composed of the characteristics obtained by the geophysical tests; RES is based on the results of the indirect sonic tests; and RGPR is composed of the amplitude values obtained by the GPR test. The activation function used between the layers, available in the toolbox, was the hyperbolic tangent (Tansig) [18,20,21,23]. The number of neurons in the hidden layer was also varied (5, 10, 15, 20, 25 and 30 neurons). The characteristics of these networks are presented in Table 8.
The random combination networks are shown in Table 9, with the suffixes 3, 2 and 1 referring to the use of the input data of the sonic tests with accelerometers Ac3, Ac2 and Ac1, respectively. All networks were trained with varying numbers of neurons in the hidden layer (5, 10, 15, 20, 25 and 30), except the networks ending in 2 and 1 (Ac2 and Ac1 sonic data, respectively), which were trained with 15 and 30 neurons and are presented later. The numbers of neurons for the networks ending in 2 and 1 were selected after several trials.
The results of the training of the networks are presented below. It is worth mentioning that the training results of the networks with the presence of input data from the sonic tests, namely Ac2 and Ac1, are presented together at the end of the section. For the artificial neural networks for which the results are presented, the time required for calculation as a function of the number of selected neurons did not change significantly and is always less than one minute.

RESULTS OF ANN LEARNING
The RC3 (full network) training was performed on 32 samples, comprising the three zones of each wall plus the average of the three zones of each wall. The graph of Fig. (6) shows the error (%) obtained for each sample in each training run with a different number of neurons. Table 10 presents the training results in terms of the maximum relative error obtained (%) for each number of neurons used.
Although these error values are still significantly large (28%) for training, they are understandable. Given that the network must unveil an implicit behavior of a given phenomenon, if this phenomenon contains many variabilities, ANN will repeat this variability. That is, if there is a variability of up to 47% implied in the input data, then the network response with a 28% error is coherent. In addition, the RC3 network with 20 neurons reached a maximum error of 10%, which made its use feasible.
The results of the training of five networks with a combination of input data by type of test (RCG, RCD, RCGF3, RES3 and RGPR), with variations in the number of neurons in the hidden layer, are presented. RCG is the network composed of the geometric characteristics, namely the presence and number of masonry cross-blocks and the distinction between regular and irregular faces. RCD is the network trained with input data on the dynamic characteristics, namely the natural frequency values in the x and y directions. RCGF3 is the network trained with input data on the characteristics obtained with the geophysical tests (GPR and indirect sonic tests of Ac3), specifically the average value of the amplitude and the propagation velocity of the wave. RES3 is the network trained only with input data from the indirect sonic tests of Ac3, i.e., the wave propagation velocity. RGPR is the network trained only with input data on the amplitude values obtained by the GPR test.
The trained networks RCG, RCD, RES3, RGPR and RCGF3 present high error values for most numbers of neurons (5, 10, 15, 20, 25 and 30). From the relative errors of these networks, presented in Table 11, it is concluded that the use of this simulation tool with this distribution of input data is generally not feasible, since most of the error rates are above 30%. The exceptions are the RCD networks with 5, 20, 25 and 30 neurons, RGPR with 25 neurons and RCGF3 with 30 neurons, which presented errors less than or equal to 30%. The networks considered viable are highlighted in green in this table.
The following are the training results of the nine randomly combined networks (R1-3, R2-3, R3, R4, R5-3, R6, R7-3, R8 and R9-3), with variations in the number of neurons in the hidden layer. R1-3 is the network trained with input data based on the results obtained with GPR, sonic tests (Ac3) and geometric characteristics. R2-3 is the network trained with input data based on the results obtained with the indirect sonic tests (Ac3) and the geometric characteristics. R3 is the network trained with input data based on the results obtained with the GPR (mean amplitude values) and geometric characteristics. R4 is the network trained with input data from the dynamic tests (natural frequencies x and y) and geometric characteristics. R5-3 is the network trained with input data from the dynamic tests (natural frequencies x and y), geometric characteristics and indirect sonic tests (Ac3). R6 is the network trained with input data from the dynamic tests (natural frequencies x and y), geometric characteristics and the GPR tests (mean amplitude). R7-3 is the network trained with input data based on the results obtained with dynamic and sonic tests (Ac3). R8 is the network trained with input data based on the results obtained with the dynamic tests and the GPR. R9-3 is the network trained with input data from the dynamic tests, GPR and sonic tests (Ac3) (Fig. 7).
The trained networks R1-3, R2-3, R3, R4, R5-3, R6, R7-3, R8 and R9-3 have error values that vary widely, between -67% and -8%, over the numbers of neurons tested (5, 10, 15, 20, 25 and 30). Among them, networks with higher errors (above 30%) are not considered for use. The R1-3 network is not indicated with 20 and 25 neurons. For the R2-3 network, only its use with 30 neurons is indicated. The R3 network is not indicated with any number of neurons. The R4 network is indicated only with 20 neurons. The R5-3 network is indicated with every number of neurons except five. The R6 network is indicated with 20 and 30 neurons. The R7-3 network is indicated with every number of neurons except 10. The R8 network is indicated only with 10 and 30 neurons, and the R9-3 network is indicated for all numbers of neurons trained. The indicated networks are highlighted in green in Table 12.
After analyzing the relative error results of the trained networks presented in Table 12, it is concluded that the use of the R9-3 tool, with this input data distribution, is feasible and efficient since the error rate reached was -8%, with the use of 30 neurons.
After training the 90 networks mentioned above, the results were analyzed to verify which training runs were most efficient in relation to the number of neurons, considering the error rates. The lowest error rates were obtained with architectures with 30, 20 and 15 neurons in the hidden layer. For this new training phase, with the other accelerometers (Ac1 and Ac2) in the sonic tests, 16 types of networks were defined and trained: RC1, RC2, RCGF1, RCGF2, RES1, RES2, R1-1, R1-2, R2-1, R2-2, R5-1, R5-2, R7-1, R7-2, R9-1 and R9-2. Considering the behavior against the number of neurons only for the networks with sonic test inputs, the best error rates were obtained with 15 and 30 neurons. Therefore, the following procedures were carried out with 15 and 30 neurons.
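The count of 90 networks in the first phase follows from 15 input combinations trained with six hidden-layer sizes each. The sketch below enumerates both training phases; the second-phase total of 32 is derived here from the 16 network types and the two hidden-layer sizes named above, and the variable names are hypothetical.

```python
from itertools import product

# First phase: six characteristic-based and nine random combinations (Ac3 data).
first_phase_networks = [
    "RC3", "RCG", "RCD", "RCGF3", "RES3", "RGPR",
    "R1-3", "R2-3", "R3", "R4", "R5-3", "R6", "R7-3", "R8", "R9-3",
]
first_phase_sizes = [5, 10, 15, 20, 25, 30]
first_phase = list(product(first_phase_networks, first_phase_sizes))

# Second phase: networks with sonic inputs, retrained with Ac1 and Ac2 data.
sonic_bases = ["RC", "RCGF", "RES", "R1-", "R2-", "R5-", "R7-", "R9-"]
second_phase_networks = [f"{base}{ac}" for base in sonic_bases for ac in (1, 2)]
second_phase = list(product(second_phase_networks, [15, 30]))

print(len(first_phase), len(second_phase_networks), len(second_phase))
```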
Following the criteria described previously, the networks R1, R2 and RES, using the results of Ac1 and Ac2, obtained unsatisfactory results. These networks have in common the absence of dynamic results as input data. The most efficient networks are R9-1, R7-1 and R5-1 with 30 neurons and RC2 with 15 neurons. These have in common the presence of dynamic and sonic results as input data. Table 13 presents the networks with the most efficient simulation, with errors in the values of E2 (GPa) of less than 20%. These simulation tools can be applied more safely in the final response.
The 44 networks that presented error rates between 20% and 30% (Table 14) were considered possible tools, but for secondary use. The other networks, with error rates above 30%, were not used and were removed.
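The selection criteria applied to the trained networks can be summarized in a short sketch. It assumes that the thresholds apply to the magnitude of the maximum relative error and that the boundary values of 20% and 30% fall in the secondary group; the function name and the labels are hypothetical.

```python
def classify_network(max_error_percent):
    """Group a trained network by the magnitude of its maximum relative error."""
    magnitude = abs(max_error_percent)
    if magnitude < 20:
        return "primary"      # most efficient networks
    if magnitude <= 30:
        return "secondary"    # possible tools, for secondary use
    return "discarded"        # not used


labels = [classify_network(e) for e in (-8, 28, -67)]
print(labels)
```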