
Predicting the black hole mass and correlations in X-ray reverberating AGN using neural networks

P. Chainakun1,2, I. Fongkaew1, S. Hancock3, A. J. Young3
1School of Physics, Institute of Science, Suranaree University of Technology, Nakhon Ratchasima 30000, Thailand
2Centre of Excellence in High Energy Physics and Astrophysics, Suranaree University of Technology, Nakhon Ratchasima 30000, Thailand
3HH Wills Physics Laboratory, Tyndall Avenue, Bristol BS8 1TL, UK
E-mail: pchainakun@g.sut.ac.th
(Accepted XXX. Received YYY; in original form ZZZ)
Abstract

We develop neural network models to predict the black hole mass using 22 reverberating AGN samples in the XMM-Newton archive. The model features include the fractional excess variance ($F_{\rm var}$) in the 2–10 keV band, the Fe-K lag amplitude, the 2–10 keV photon counts and the redshift. We find that the prediction accuracy of the neural network model is significantly higher than that obtained from the traditional linear regression method. Our predicted mass can be confined within $\pm(2$–$5)$ per cent of the true value, suggesting that the neural network technique is a promising and independent way to constrain the black hole mass. We also apply the model to 21 non-reverberating AGN to rule out the possibility that they exhibit the lags (some have too small a mass and $F_{\rm var}$, while others have too large a mass and $F_{\rm var}$, contradicting the $F_{\rm var}$–lag–mass relation in reverberating AGN). We also simulate 3200 reverberating AGN samples using the multi-feature parameter space from the neural network model to investigate the global relations if the number of reverberating AGN increases. We find that the $F_{\rm var}$–mass anti-correlation is likely to become stronger as the number of newly discovered reverberating AGN increases. In contrast, to maintain the lag–mass scaling relation, the tight anti-correlation between the lag and $F_{\rm var}$ must be preserved. In an extreme case, the lag–mass correlation coefficient can significantly decrease and, if observed, may suggest the extended corona framework in which the observed lags are driven more by the coronal property than by the geometry.

keywords:
accretion, accretion discs – black hole physics – galaxies: active – X-rays: galaxies
pubyear: 2021

1 Introduction

The X-ray variability has become a powerful tool to probe the inner accretion flow around the central black holes in both active galactic nuclei (AGN) and X-ray binaries. X-ray reverberation mapping is a timing analysis technique that measures the time delays associated with the light-travel time between the X-ray photons from the direct coronal emission and the back-scattered photons from the accretion disc (see Uttley et al., 2014; Cackett, Bentz, & Kara, 2021, for a review). Because the reflected photons travel a longer distance, the reflection-dominated bands (the soft excess, Fe-K and Compton hump bands) lag behind the continuum-dominated bands (e.g. Fabian et al., 2009; Kara et al., 2013a, c, 2014; Zoghbi et al., 2014; Kara et al., 2019; Alston et al., 2020; Vincentelli et al., 2020). The amplitude of the lag depends on the geometry of the source, thus providing a tool to constrain the disc-corona geometry.

De Marco et al. (2013) took a systematic look at the frequency-dependent time lags between the soft ($\sim 0.3$–1 keV) and hard ($\sim 1$–4 keV) bands in AGN and found that the amplitude of the soft lag scales with the black hole mass. Kara et al. (2016a) performed a systematic study of the X-ray reverberation lags of all Seyfert galaxies available in the XMM-Newton archive and reported a number of sources that exhibited Fe-K reverberation lags. They confirmed the lag-mass scaling relation and found a correlation between the height of the corona and the mass accretion rate of these reverberating AGN. King, Lohfink, & Kara (2017) found that the radio Eddington luminosity inversely correlates with the X-ray reflection fraction, and positively scales with the path length between the X-ray source and the accretion disc. Modelling the X-ray reverberation lags to map the extreme region near a supermassive black hole has been carried out intensively using both the lamppost geometry (Wilkins & Fabian, 2013; Cackett et al., 2014; Emmanoulopoulos et al., 2014; Chainakun & Young, 2015; Chainakun, Young, & Kara, 2016; Epitropakis et al., 2016; Caballero-García et al., 2018; Ingram et al., 2019) and the extended corona model (Wilkins et al., 2016; Chainakun & Young, 2017; Chainakun et al., 2019). The X-ray reverberation technique has already been applied to various scenarios such as the photon reflection off the accretion flow in a tidal disruption event (Kara et al., 2016b), the reflection from different hot-flow zones in X-ray binaries (Mahmoud, Done, & De Marco, 2019; Chainakun et al., 2021b; Kawamura et al., 2021) and the multiple scattering from the disc wind in ultraluminous X-ray sources (Luangtip et al., 2021).

Furthermore, Alston et al. (2020) suggested that the height of the X-ray corona in IRAS 13224–3809 increased with the source luminosity. This is also supported by Caballero-García et al. (2020) who found significant variations in the X-ray source height from $\sim 3$–$5\,r_{\rm g}$ when the X-ray luminosity is $\sim 1.5$–3 per cent of the Eddington limit, to $\sim 10\,r_{\rm g}$ when the luminosity doubles. Recently, Hancock et al. (in prep) investigated the time-average and lag-frequency spectra of 20 AGN covering 121 XMM-Newton observations and separated them into 3–4 groups of similarly observed spectral states. The reflection fraction was found to be strongly correlated with the power-law photon index, suggesting dynamics of the emitting region.

The X-ray reverberation features can be imprinted in the profiles of the power spectral density (PSD) that describes the variability power on different timescales. The oscillatory structures seen in the PSD of AGN can be interpreted as the reverberation signatures that relate to the geometry of the system such as the coronal height and the inclination (Papadakis et al., 2016; Emmanoulopoulos et al., 2016; Chainakun, 2019). In Chainakun et al. (2021a), we developed machine learning (ML) models, based on dictionary learning and support vector machine algorithms, to extract the X-ray reverberation signatures on the PSD profiles of AGN, and used them to predict the coronal height. The variability amplitude in light curves can be estimated using the fractional excess variance ($F_{\rm var}$), which can be constructed by integrating the PSD between two frequencies, or from the mean and the rms amplitude of the light curve (e.g. Vaughan, Fabian, & Nandra, 2003). Recently, the $F_{\rm var}$ spectra have been used to probe the intrinsic and environmental absorption origins for the X-ray variability in AGN (Parker et al., 2021).

The potential use of the ML techniques in the X-ray reverberation analysis has been elaborated and discussed in Chainakun et al. (2021a). Here, we employ the key information of the X-ray reverberating AGN to develop a neural network model in order to predict the black hole mass. We consider the fundamental parameters that can be derived from the X-ray observations. These parameters include the Fe-K lag amplitude, the $F_{\rm var}$, the photon counts ($C$), the bolometric luminosity ($L_{\rm bol}$) and the redshift ($z$). The source height and the reflection fraction are not considered because their values are dependent on the assumed geometry and the choice of the reflection models.

The ultimate goal is to test how well the neural network model, trained using only fundamental X-ray observational parameters, can predict the central mass. The prediction accuracy obtained from the neural network model is reported and compared to that obtained from the standard linear regression model. After that, we plot the mass distribution against the model features such as the $F_{\rm var}$ and the lag amplitude to investigate the global relations between them all simultaneously. From the parameter space constrained by the model, we can simulate more samples of reverberating AGN and investigate the correlations between their parameters. This provides hints of how the known correlations will change if more reverberating AGN are discovered. The AGN data used for training the machine are presented in Section 2. The neural network algorithm and the development of the ML models to predict the black hole mass are explained in Section 3. We evaluate the models and present the results in Section 4. We discuss the optimization results and the obtained correlations in Section 5, while the conclusion is provided in Section 6.

2 AGN data

We use the XMM-Newton data previously reported and analyzed by Kara et al. (2016a) where the selected samples have $\gtrsim 40$ ks exposure time and show some variability. However, we select only the data that display the Fe-K reverberation features in the lag spectra whose lower and upper bounds of the lags can be constrained. Based on these criteria, there are 22 AGN sources in total. These AGN samples and their parameter values are listed in Table 1. The Fe-K lag amplitude, $F_{\rm var}$, log($L_{\rm bol}$) and log($C$) of each AGN are average values from those of all available observations that fit the criteria. $F_{\rm var}=\sigma_{\rm rms}/\bar{x}$ is the fractional excess variance of the data in the 2–10 keV band, where $\sigma_{\rm rms}$ is the rms amplitude and $\bar{x}$ is the mean count rate (Vaughan, Fabian, & Nandra, 2003). $C$ is the photon counts in the 2–10 keV band. $L_{\rm bol}$ is the mean bolometric luminosity calculated from the SEDs (Wang, Watarai, & Mineshige, 2004; Vasudevan & Fabian, 2007, 2009; Vasudevan et al., 2010). The masses that are estimated by the optical reverberation techniques are from the public web database (Bentz & Katz, 2015) using $\langle f\rangle = 4.3$ (Grier et al., 2013), similar to Kara et al. (2016a).
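As a minimal sketch of how $F_{\rm var}$ can be estimated from a single light curve, the function below follows the excess-variance form given by Vaughan, Fabian, & Nandra (2003), in which the mean squared measurement error is subtracted from the sample variance before normalising by the mean count rate. The function name and the light-curve values are illustrative assumptions and are not taken from Table 1.

```python
import math


def fractional_excess_variance(rates, errors):
    """Return F_var = sqrt((S^2 - <err^2>) / mean^2) for one light curve.

    rates  : count rate in each time bin
    errors : 1-sigma measurement error in each time bin
    Returns NaN when the noise term exceeds the observed variance.
    """
    n = len(rates)
    if n < 2 or n != len(errors):
        raise ValueError("need at least two bins and one error per bin")
    mean = sum(rates) / n
    # Sample variance S^2 of the light curve (n - 1 in the denominator).
    sample_var = sum((x - mean) ** 2 for x in rates) / (n - 1)
    # Mean squared measurement error, i.e. the expected noise contribution.
    mean_sq_err = sum(e ** 2 for e in errors) / n
    excess = sample_var - mean_sq_err
    if excess < 0:
        return float("nan")
    return math.sqrt(excess) / mean


# Hypothetical 2-10 keV light curve (counts per second), for illustration only.
example_rates = [10.0, 12.0, 8.0, 10.0]
example_errors = [0.5, 0.5, 0.5, 0.5]
fvar_example = fractional_excess_variance(example_rates, example_errors)
print(f"F_var = {fvar_example:.3f}")
```

Returning NaN for noise-dominated light curves is a design choice of this sketch; an analysis pipeline might instead flag or discard such observations.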

There are also $\sim 21$ AGN samples left that have $\gtrsim 40$ ks exposure time and show some variability but do not exhibit Fe-K reverberation features in the lag spectra. We set them aside in a new data set (Table 4) for further analysis once the neural network model is obtained. In other words, while the reverberating AGN data in Table 1 are used during the training phase, the non-reverberating AGN data in Table 4 are kept unseen, completely new to the machine, and are used only for the final evaluation of the model. Note that there are some inconsistencies in the reports of the Fe-K response in the previous literature. For example, Wilkins et al. (2021) recently found a Fe-K reverberation lag caused by a flare in IZw1, while this source is still included in the non-reverberation sample here (Table 4). We, however, choose to follow the standard analysis of Kara et al. (2016a) and discuss these inconsistencies later in the Discussion section.

Table 1: Observed reverberating AGN data used for training and testing the neural network model. The table includes the AGN name, black hole mass, 2-10 keV fractional excess variance, Fe-K lag amplitude, bolometric luminosities, total 2–10 keV counts and redshift. These AGN are all Fe-K reverberating AGN probed by XMM-Newton and were previously analyzed by Kara et al. (2016a). The numbers in brackets denote the references where: (1) Bian & Zhao (2003); (2) Ponti et al. (2012); (3) Agís-González et al. (2014); (4) González-Martín & Vaughan (2012); (5) Alston et al. (2015); (6) Schulz, Knake, & Schmidt-Kaler (1994); (7) Marconi et al. (2008); (8) Alston et al. (2014); (9) Malizia et al. (2008); (10) Kara et al. (2013a); (11) Kara et al. (2013c); (12) Kara et al. (2013b); (13) Zoghbi et al. (2013); (14) Kara et al. (2015); (15) Zoghbi et al. (2012); (16) Kara et al. (2014); (17) Marinucci et al. (2014). (R) indicates the optical reverberation mass estimate. (K) refers to Kara et al. (2016a).
| AGN name | log($M/M_{\odot}$) | $F_{\rm var}$ | Lag amplitude (s) | log($L_{\rm bol}$) | log($C$) | $z$ |
|---|---|---|---|---|---|---|
| 1H 0707–495 | 6.31 (1) | 0.527 | 47 ± 16 (10) | 44.43 | 5.158 | 0.0406 |
| Ark 564 | 6.27 (2) | 0.213 | 92 ± 65 (11) | 44.36 | 6.207 | 0.0247 |
| ESO 362–G18 | 7.65 (3) | 0.131 | 1562 ± 606 (K) | 44.11 | 4.849 | 0.0124 |
| IC 4329A | 8.3 (4) | 0.028 | 696 ± 331 (K) | 44.92 | 6.212 | 0.0161 |
| IRAS 13224–3809 | 6.8 (4) | 0.612 | 299 ± 135 (12) | 45.74 | 4.628 | 0.0658 |
| IRAS 17020+4544 | 6.54 (2) | 0.156 | 128 ± 88 (K) | 44.74 | 5.164 | 0.0604 |
| MCG–5–23–16 | 7.92 (5) | 0.074 | 1037 ± 455 (13) | 44.30 | 6.681 | 0.0085 |
| Mrk 335 | 7.23 (R) | 0.177 | 193 ± 98 (11) | 45.10 | 5.631 | 0.0258 |
| MS 22549–3712 | 7.0 (5) | 0.100 | 1500 ± 850 (5) | 45.09 | 5.000 | 0.0390 |
| NGC 1365 | 7.6 (4) | 0.234 | 500 ± 120 (14) | 43.99 | 5.940 | 0.0055 |
| NGC 3783 | 7.371 (R) | 0.066 | 172 ± 62 (K) | 44.28 | 6.072 | 0.0097 |
| NGC 4051 | 6.13 (R) | 0.400 | 90 ± 30 (K) | 43.26 | 5.301 | 0.0023 |
| NGC 4151 | 7.65 (R) | 0.077 | 880 ± 360 (15) | 44.01 | 5.899 | 0.0033 |
| NGC 5506 | 7.4 (4) | 0.097 | 398 ± 252 (K) | 44.22 | 6.307 | 0.0062 |
| NGC 5548 | 7.718 (R) | 0.039 | 311 ± 109 (K) | 44.79 | 5.700 | 0.0172 |
| NGC 6860 | 7.6 (4) | 0.070 | 398 ± 252 (K) | 43.71 | 5.530 | 0.0149 |
| NGC 7314 | 6.7 (6) | 0.223 | 77 ± 31 (13) | 42.98 | 5.937 | 0.0048 |
| NGC 7469 | 6.956 (R) | 0.078 | 1848 ± 1451 (K) | 45.10 | 5.806 | 0.0163 |
| PG 1211+143 | 7.61 (2) | 0.118 | 1179 ± 980 (K) | 46.17 | 4.964 | 0.0809 |
| PG 1244+026 | 7.26 (7) | 0.190 | 726 ± 306 (16) | 44.62 | 4.757 | 0.0482 |
| REJ 1034+398 | 6.6 (8) | 0.170 | 450 ± 200 (K) | 44.52 | 5.602 | 0.0424 |
| SWIFT J2127.4+5654 | 7.18 (9) | 0.137 | 408 ± 127 (17) | 44.55 | 6.170 | 0.0144 |

3 Methods

We train the machine using the neural network technique so that it can make an accurate prediction of the black hole mass. The flowchart illustrating the training and testing process is presented in Fig. 1. The following subsections outline step-by-step the methodology used to explore the nature of the data as well as to develop and evaluate our neural network models.

Figure 1: Flowchart of our ML algorithm. The AGN data for training and testing (Table 1) are split into the training set (90 per cent) and test set (10 per cent). The neural network algorithm is used to train the machine to make an accurate prediction of the black hole mass. The new AGN data set (Table 4) is set aside for the final evaluation of the model.

3.1 Pre-processing the data

The data set used during the training and testing phase contains 22 AGN sources in total, each of which consists of five features: the lag amplitude, $F_{\rm var}$, log($L_{\rm bol}$), log($C$) and $z$. We train the machine using only the mean values of the data (i.e. ignoring the measurement uncertainties), so that the model becomes as simple as possible. Since the number of reverberating AGN sources is small, we split them randomly with a sample ratio of 90/10 per cent for training/testing. This can be done using sklearn.model_selection.train_test_split() available in scikit-learn (https://scikit-learn.org/; Pedregosa et al., 2012).

We also carry out the test when the data for each feature are scaled to be between 0 and 1. This aims to inspect how much the scaling factor affects the performance of the model.
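The pre-processing described above can be sketched as follows, using the mean values from Table 1 for the four features retained later in the analysis. The fixed `random_state`, the variable names, and the choice to fit the scaler on the training set only are assumptions of this sketch rather than details stated in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Mean values from Table 1, in the row order of the table.
fvar = [0.527, 0.213, 0.131, 0.028, 0.612, 0.156, 0.074, 0.177, 0.100, 0.234, 0.066,
        0.400, 0.077, 0.097, 0.039, 0.070, 0.223, 0.078, 0.118, 0.190, 0.170, 0.137]
lag = [47, 92, 1562, 696, 299, 128, 1037, 193, 1500, 500, 172,
       90, 880, 398, 311, 398, 77, 1848, 1179, 726, 450, 408]
log_counts = [5.158, 6.207, 4.849, 6.212, 4.628, 5.164, 6.681, 5.631, 5.000, 5.940, 6.072,
              5.301, 5.899, 6.307, 5.700, 5.530, 5.937, 5.806, 4.964, 4.757, 5.602, 6.170]
redshift = [0.0406, 0.0247, 0.0124, 0.0161, 0.0658, 0.0604, 0.0085, 0.0258, 0.0390,
            0.0055, 0.0097, 0.0023, 0.0033, 0.0062, 0.0172, 0.0149, 0.0048, 0.0163,
            0.0809, 0.0482, 0.0424, 0.0144]
log_mass = [6.31, 6.27, 7.65, 8.3, 6.8, 6.54, 7.92, 7.23, 7.0, 7.6, 7.371,
            6.13, 7.65, 7.4, 7.718, 7.6, 6.7, 6.956, 7.61, 7.26, 6.6, 7.18]

X = np.column_stack([fvar, lag, log_counts, redshift])
y = np.array(log_mass)

# Random 90/10 split; random_state is fixed here only for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# Optional scaling of every feature to [0, 1], fitted on the training set
# and then applied unchanged to the test set.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Because the scaler is fitted on the training set alone, scaled test values may fall slightly outside the 0 to 1 range.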

3.2 Correlations of the data

We analyze the Pearson and Spearman’s rank correlation coefficients ($r_{p}$ and $r_{s}$, respectively) between each model feature and the black hole mass. $r_{p}$ measures the linear correlation between two features, resulting in a value between $-1$ and 1; $r_{p}=1$ and $-1$ mean perfect linear correlation and anti-correlation, respectively. On the other hand, $r_{s}$ assesses the strength and direction of the monotonic relationship between two features, and also ranges from $-1$ (perfect monotonic anti-correlation) to 1 (perfect monotonic correlation). The features with a very weak correlation with mass are discarded in order to save computational time during the training phase. Because the amount of data is small, it is also better to keep the number of features as small as possible to avoid overfitting.
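To make the two coefficients concrete, the sketch below computes $r_{p}$ and $r_{s}$ between $F_{\rm var}$ and log($M/M_{\odot}$) using the values listed in Table 1. It uses only the standard library; the helper names are hypothetical, and tied values receive their average rank, which is a common convention assumed here.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# F_var and log(M/M_sun) from Table 1, in the row order of the table.
fvar = [0.527, 0.213, 0.131, 0.028, 0.612, 0.156, 0.074, 0.177, 0.100, 0.234, 0.066,
        0.400, 0.077, 0.097, 0.039, 0.070, 0.223, 0.078, 0.118, 0.190, 0.170, 0.137]
log_mass = [6.31, 6.27, 7.65, 8.3, 6.8, 6.54, 7.92, 7.23, 7.0, 7.6, 7.371,
            6.13, 7.65, 7.4, 7.718, 7.6, 6.7, 6.956, 7.61, 7.26, 6.6, 7.18]

r_p = pearson(fvar, log_mass)
r_s = spearman(fvar, log_mass)
print(f"r_p = {r_p:.3f}, r_s = {r_s:.3f}")
```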

Throughout the paper, the trend of correlations is illustrated using the Sieve diagram, which represents the relationship between categorical variables using observed and expected frequencies under independence. It displays the structure of the data and the model (i.e. the pattern of association), helping us visualise the relationship between the observed and expected frequencies of two variables. The data analysis and visualization are carried out using the Orange platform, a data mining toolbox in Python (Demšar et al., 2013).

3.3 Neural network algorithm

The neural network algorithm employed here is sklearn.neural_network.MLPRegressor() available in scikit-learn (Pedregosa et al., 2012). The MLPRegressor() is the Multi-layer Perceptron (MLP) regressor that can optimize, iteratively, the partial derivatives of the loss function at each time step based on the choice of the activation function and solver. The MLP architecture contains a series of layers that consist of neurons and their connections, building up a neural network. The basic unit of a neural network is a neuron that takes inputs, processes them and produces one output. The data with $n$ features (i.e. $n$ inputs) can be written as $\textbf{x}=[x_{1},x_{2},x_{3},...,x_{n}]$. Within a neuron, each input is multiplied by a weight, $\textbf{w}=[w_{1},w_{2},w_{3},...,w_{n}]$. Then, the weighted inputs are added together with one bias $b$ and are passed through an activation function $f$ to obtain one output $y$:

$$y=f(\textbf{x}\cdot\textbf{w}+b)\;. \tag{1}$$

The choice of the activation function depends on the nature of the data. A commonly used example is the sigmoid function, whose output always lies in the interval $(0,1)$.
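Equation (1) can be illustrated with a single neuron evaluated by hand. The sketch below is a plain-Python illustration with a sigmoid activation; the input, weight and bias values are arbitrary assumptions chosen only to show the arithmetic.

```python
import math


def sigmoid(t):
    """Logistic sigmoid, mapping any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def neuron(x, w, b, activation=sigmoid):
    """Evaluate y = f(x . w + b) for one neuron."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w)) + b
    return activation(weighted_sum)


# Arbitrary illustrative inputs, weights and bias.
x_example = [0.5, -1.0, 2.0]
w_example = [0.4, 0.3, -0.2]
b_example = 0.1

y_example = neuron(x_example, w_example, b_example)
print(f"y = {y_example:.4f}")
```

Here the weighted sum is $0.2-0.3-0.4+0.1=-0.4$, so the output is the sigmoid of $-0.4$.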

A neural network is produced by combining many neurons. It can have any number of layers with any number of neurons in those layers. Any layers between the input (first) layer and the output (last) layer are referred to as hidden layers. The appropriate number of hidden layers and the number of neurons in each layer can be fine-tuned during the training phase.

Performance of the neural network model can be evaluated using either the mean absolute error (MAE) or the mean squared error (MSE):

$${\rm MAE}=\frac{1}{N}\sum_{i=1}^{N}|y_{i,{\rm true}}-y_{i,{\rm pred}}|\;, \tag{2}$$
$${\rm MSE}=\frac{1}{N}\sum_{i=1}^{N}|y_{i,{\rm true}}-y_{i,{\rm pred}}|^{2}\;, \tag{3}$$

where $N$ is the number of AGN samples, and $y_{\rm true}$ and $y_{\rm pred}$ are the true and predicted values of the black hole mass, respectively. During the training and testing phase, the error is estimated in the form of the loss function, $L$, by averaging over all obtained errors. $L$ can then be written as a multivariable function $L(w_{1},w_{2},w_{3},...,b_{1},b_{2},b_{3},...)$. The change of the loss when changing one of these variables (e.g. $\frac{\partial L}{\partial w_{1}}$) can be calculated. Therefore, the goal of training a neural network is to minimize its loss in predicting the black hole mass by finding appropriate weights and biases.
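Equations (2) and (3) translate directly into code. The following sketch evaluates both metrics for a small set of hypothetical true and predicted log-mass values; the numbers are invented for illustration and are not results from this work.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: mean of |y_true - y_pred| over all samples, as in equation (2)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: mean of |y_true - y_pred|^2 over all samples, as in equation (3)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted values of log(M/M_sun), illustrative only.
y_true_example = [6.3, 7.2, 8.0]
y_pred_example = [6.5, 7.0, 8.1]

mae_example = mean_absolute_error(y_true_example, y_pred_example)
mse_example = mean_squared_error(y_true_example, y_pred_example)
print(f"MAE = {mae_example:.4f}, MSE = {mse_example:.4f}")
```

Because the MSE squares each residual, it penalises large individual errors more heavily than the MAE does.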

3.4 Hyperparameter optimization

Hyperparameters are the parameters that cannot be directly learned by the machine. Fine-tuning them is one of the important processes to improve the performance of the ML model. There are two key hyperparameters for the neural network algorithm which are the hidden layer size and the number of neurons in each layer. Since the number of our inputs is quite small, we begin by setting the number of hidden layers to be 1 and allow the number of neurons to be varied between 1 and 250 neurons. We increase the number of hidden layers only if the prediction accuracy using 1 layer is low. The learning rate is also a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. An adaptive learning rate algorithm is used by starting from an initial value of 0.001.

Moreover, we investigate three solvers including the standard Stochastic Gradient Descent (SGD), Adam and L-BFGS-B algorithms. Adam is an adaptive algorithm for the first-order gradient-based optimization, which is an extension to SGD. L-BFGS-B is an extension of the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm which is a type of second-order optimization algorithm that uses a limited amount of computer memory and can handle bound constraints on variables. These different solvers can provide different optimization results.

We investigate four activation functions: the identity (linear), logistic (sigmoid), hyperbolic tangent (tanh) and Rectified Linear Unit (ReLU) functions. The activation function helps the network learn complex data so that the model can provide accurate predictions. It is clear from eq. 1 that a neural network without an activation function, $f$, reduces to a linear regression model. A neural network with the identity activation function is likewise equivalent to the linear regression model. If the neural network model prefers the identity activation function, the model behaves like a single-layer network even when we increase the number of layers in our network. This is because a composition of linear layers is itself a linear function, so the additional layers do not further improve the model.
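A reduced version of this hyperparameter search can be sketched with MLPRegressor() as follows. The sketch uses $F_{\rm var}$ and the lag amplitude from Table 1 as features, one hidden layer, the L-BFGS solver (named "lbfgs" in scikit-learn) and all four activation functions. The small neuron grid, the scaling step, `max_iter`, `random_state`, and scoring on the same data used for fitting are simplifying assumptions of the sketch, not the procedure used to produce the results in this paper.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# F_var, Fe-K lag amplitude (s) and log(M/M_sun) from Table 1.
fvar = [0.527, 0.213, 0.131, 0.028, 0.612, 0.156, 0.074, 0.177, 0.100, 0.234, 0.066,
        0.400, 0.077, 0.097, 0.039, 0.070, 0.223, 0.078, 0.118, 0.190, 0.170, 0.137]
lag = [47, 92, 1562, 696, 299, 128, 1037, 193, 1500, 500, 172,
       90, 880, 398, 311, 398, 77, 1848, 1179, 726, 450, 408]
log_mass = np.array([6.31, 6.27, 7.65, 8.3, 6.8, 6.54, 7.92, 7.23, 7.0, 7.6, 7.371,
                     6.13, 7.65, 7.4, 7.718, 7.6, 6.7, 6.956, 7.61, 7.26, 6.6, 7.18])

# Scale both features to [0, 1] so that they have comparable ranges.
X = MinMaxScaler().fit_transform(np.column_stack([fvar, lag]))

results = {}
with warnings.catch_warnings():
    # L-BFGS may stop at max_iter on some settings; silence that warning here.
    warnings.simplefilter("ignore", ConvergenceWarning)
    for activation in ("identity", "logistic", "tanh", "relu"):
        for n_neurons in (4, 16, 64):  # reduced grid, for illustration
            model = MLPRegressor(
                hidden_layer_sizes=(n_neurons,),  # one hidden layer
                activation=activation,
                solver="lbfgs",
                max_iter=2000,
                random_state=0,
            )
            model.fit(X, log_mass)
            # R^2 on the fitting data only; not a held-out test score.
            results[(activation, n_neurons)] = model.score(X, log_mass)

best = max(results, key=results.get)
print("best (activation, neurons):", best, "R^2 =", round(results[best], 4))
```

Extending the loop over the "sgd" and "adam" solvers and a wider range of neurons follows the same pattern, at the cost of longer run time.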

4 Results and analysis

4.1 Current observed AGN data

First of all, we discretize the data into two groups, reverberating and non-reverberating AGN samples, and explore their distribution in the parameter space. Note that the non-reverberating AGN refer to the sources for which the Fe-K reverberation lags cannot be robustly detected with XMM-Newton even though they show some variability. The inconsistency in detecting the Fe-K reverberation in some sources as reported in the previous literature is discussed in the Discussion section. Fig. 2 shows the Sieve diagram of the frequencies of discovering these reverberating and non-reverberating AGN with respect to their masses. The Sieve diagram compares the observed frequencies to the expected frequencies under the assumption of independence. The area of each rectangle (or cell) is proportional to the expected frequency, while the number of squares in each cell represents the observed frequency. Cells whose observed frequency is greater (smaller) than the expected frequency are shown in blue (red), with the shading becoming more intense for larger deviations. The result suggests a high probability of finding the reverberating AGN if the central mass is in the range of $\sim 10^{6.5}$–$10^{8}M_{\odot}$ (blue shade in the upper row). On the other hand, the samples are dominated by non-reverberating AGN if the central mass is beyond the lower and upper bounds of that range. In particular, when the mass is $\gtrsim 10^{8}M_{\odot}$, there is a small probability of finding AGN displaying reverberation lags.

Figure 2: Sieve diagram visualizing the frequencies of discovering reverberating AGN (Table 1) and non-reverberating AGN (Table 4) across all masses. The area of each rectangle and the number of squares inside are proportional to the expected frequency under independence and the observed frequency, respectively. The density of shading shows the difference between the observed and expected frequencies, appearing as blue and red for positive and negative deviations, respectively. The observed frequencies are shown as percentages while the values in square brackets indicate the percentages of the expected frequencies.
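The expected frequencies that underlie a Sieve diagram follow from the row and column totals of the contingency table. The sketch below computes them for a toy two-by-two table; the counts are hypothetical and do not correspond to the AGN samples in Tables 1 and 4.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Hypothetical contingency table (rows: two classes, columns: two bins).
observed_counts = [
    [10, 30],
    [20, 40],
]

expected_counts = expected_frequencies(observed_counts)

# Positive deviation: more objects observed than expected (shaded blue);
# negative deviation: fewer observed than expected (shaded red).
deviations = [
    [obs - exp for obs, exp in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed_counts, expected_counts)
]

print("expected:", expected_counts)
print("observed - expected:", deviations)
```

The expected counts always sum to the same total as the observed counts, so the deviations across the table sum to zero.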

Now let us focus on the reverberating AGN (22 samples in Table 1). The Sieve diagrams representing the chance of finding the Fe-K reverberating AGN having mass, $F_{\rm var}$ and lag amplitude in a specific range of model parameters are shown in Fig. 3. The pattern of association is clearly revealed. The number of observed AGN exhibiting larger $F_{\rm var}$ while having smaller mass exceeds what is expected under the assumption of independence. The known lag-mass scaling relation (De Marco et al., 2013; Kara et al., 2016a) is also suggested. To investigate further the nature of the mass distribution among these reverberating AGN, we discretize the samples into 5 groups based on their $F_{\rm var}$, lags and mass. The result is shown in Fig. 4. The sources are clumped around the bottom-left portion of the lag–$F_{\rm var}$ parameter space. We can see the trend of decreasing mass with increasing $F_{\rm var}$, and decreasing lag with increasing $F_{\rm var}$. The yellow clump (highest average-mass group) lies in the range of 407–764 s time lags and $F_{\rm var}<0.14$, which is driven by the AGN IC 4329A in particular.

Figure 3: Sieve diagrams for the black hole mass in 22 reverberating AGN samples, dependent on $F_{\rm var}$ (top panel) and lags (bottom panel).

Figure 4: Plot of the mass distribution dependent on the $F_{\rm var}$ and Fe-K lag amplitude of all reverberating AGN in Table 1. The data are discretized into 5 groups based on their $F_{\rm var}$, lag amplitude and central mass. Each dot represents the clump of data falling into that particular range of parameters. Different colours represent different corresponding masses as specified in the figure.

Correlation coefficients between the black hole mass and each model feature are summarized in Table 2. $F_{\rm var}$ shows a strong anti-correlation with the black hole mass in both interval and ordinal scales ($r_{p}=-0.636$ and $r_{s}=-0.718$). The lag amplitude also scales with the black hole mass with $r_{s}=0.589$, as expected. In contrast, log($L_{\rm bol}$) shows a very weak correlation with the mass ($r_{s}=0.023$). This very weak correlation is possibly due to the measurement of $L_{\rm bol}$ itself, which is likely model dependent and is easily either overestimated or underestimated depending on the inferred bolometric correction factors. We omit this feature to reduce the time consumed during the machine training processes. This also helps prevent the model from becoming too complex and too specific to the training data. Therefore, there are four features left in our consideration for developing ML models, which are $F_{\rm var}$, lag amplitude, log($C$) and $z$.

Table 2: Pearson correlation coefficient, $r_{p}$, and Spearman’s rank correlation coefficient, $r_{s}$, between log($M/M_{\odot}$) and each model feature for 22 reverberating AGN samples in the XMM-Newton archive.

| Features | $r_{p}$ | $r_{s}$ |
|---|---|---|
| $F_{\rm var}$ | $-0.636$ | $-0.718$ |
| Lag amplitude | $+0.408$ | $+0.589$ |
| log($L_{\rm bol}$) | $+0.172$ | $+0.023$ |
| log($C$) | $+0.295$ | $+0.260$ |
| $z$ | $-0.237$ | $-0.233$ |

4.2 Mass prediction by the model

The best neural network models obtained with different combinations of the features are presented in Table 3. Predictions made by the neural networks are much more accurate than those obtained traditionally from the linear regression method. Note that we fix the number of hidden layers to be 1 and fine-tune the number of neurons with different solvers and activation functions. More features require more neurons because the data are more complex. We find that the best models in all cases prefer the L-BFGS-B solver. The best activation function associated with our data is tanh, except when only the $F_{\rm var}$ is the model feature. The models can make a good prediction of the mass using the $F_{\rm var}$ and the lag amplitude alone, with $R^{2}$ of 0.6459 and 0.7302, respectively. The accuracy significantly increases if both $F_{\rm var}$ and lag amplitude are used as the features ($R^{2}=0.9124$).

Table 3: The best neural network models for different combinations of the model features. The best solver, activation function, number of hidden layers (fixed at 1) and number of neurons in each layer are presented in the second, third, fourth and fifth columns, respectively. The corresponding $R^{2}$ values and the MAE are shown in the sixth and seventh columns. The values in square brackets indicate the corresponding values obtained from the linear regression model.

| Features | Solver | Activation function | Number of layers | Number of neurons | $R^{2}$ | MAE |
|---|---|---|---|---|---|---|
| $F_{\rm var}$ | L-BFGS-B | ReLU | 1 | 91 | 0.6459 [0.3558] | 0.2604 [0.4160] |
| Lag amplitude | L-BFGS-B | tanh | 1 | 128 | 0.7302 [0.4529] | 0.2103 [0.1545] |
| $F_{\rm var}$ $\oplus$ Lag amplitude | L-BFGS-B | tanh | 1 | 168 | 0.9124 [0.3512] | 0.1077 [0.4278] |
| $F_{\rm var}$ $\oplus$ Lag amplitude $\oplus$ $z$ | L-BFGS-B | tanh | 1 | 164 | 0.9637 [0.3518] | 0.0719 [0.4283] |
| $F_{\rm var}$ $\oplus$ Lag amplitude $\oplus$ $z$ $\oplus$ log($C$) | L-BFGS-B | tanh | 1 | 249 | 0.9993 [0.3541] | 0.0100 [0.4290] |

Fig. 5 shows the scatter plots of the actual values of the black hole mass and those predicted by the neural network models presented in Table 3. For comparison, we also plot the results obtained when using the linear regression model. The linear regression model seems to overestimate the mass when $M\lesssim 10^{7}M_{\odot}$, but underestimate the mass when $M\gtrsim 10^{7}M_{\odot}$ (red data points in Fig. 5). Increasing the number of features does not improve the accuracy of the linear regression model. On the other hand, the predicted masses from the neural network model scatter around the true values with smaller errors when more features are included (blue data points in Fig. 5). We also draw the lines representing the region in the parameter space that covers 2 and 5 per cent of the mass deviation from the true values. It can be seen that the predicted mass using the combined features of the $F_{\rm var}$ and the lag amplitude can be constrained approximately within $\pm 5$ per cent of the true values. Using all features, we can constrain the predicted mass to within $\pm 2$ per cent of the true mass.
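The 2 and 5 per cent bands can be checked numerically by counting the predictions whose fractional deviation from the true value stays within a given tolerance. The sketch below assumes that the deviation is measured on log($M/M_{\odot}$), the quantity tabulated in Table 1; both this interpretation and the example values are assumptions made for illustration.

```python
def fraction_within(y_true, y_pred, tolerance):
    """Fraction of predictions with |y_pred - y_true| <= tolerance * |y_true|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for t, p in zip(y_true, y_pred) if abs(p - t) <= tolerance * abs(t)
    )
    return hits / len(y_true)


# Hypothetical true and predicted log(M/M_sun) values, illustrative only.
true_log_mass = [6.31, 7.23, 7.65, 8.30]
pred_log_mass = [6.40, 7.50, 7.70, 8.25]

within_2_per_cent = fraction_within(true_log_mass, pred_log_mass, 0.02)
within_5_per_cent = fraction_within(true_log_mass, pred_log_mass, 0.05)
print(within_2_per_cent, within_5_per_cent)
```

In this toy example the second prediction deviates by about 3.7 per cent, so it falls inside the 5 per cent band but outside the 2 per cent band.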

Figure 5: Scatter plots of the true versus predicted values of the black hole mass. The blue-cross and red-circle data points represent the results obtained from the neural network and the linear regression model, respectively (Table 3). The accuracy of the neural network model is significantly higher when using both the lag amplitude and $F_{\rm var}$ as the features of the model, and when compared with that of the linear regression model. The green solid lines show the perfect prediction lines. The regions between the two dashed lines and between the two dotted lines represent the parameter space where the deviations of the predicted black hole mass are within 2 and 5 per cent of the true values, respectively. See text for more details.

All combined-feature models show $R^{2}>0.9$ even with 1 layer, so we do not increase the number of layers, since doing so would considerably increase the computation time without significantly improving the accuracy. We also investigate the case when the data of each feature are scaled to be between 0 and 1. The results are not much different from the cases when we do not scale them. The scaling therefore has a negligible effect on the performance of the model.

4.3 Lag and no-lag in reverberating and non-reverberating AGN

Furthermore, the $F_{\rm var}$–lag–mass model inferred from the neural network can be extrapolated to reveal the parameter space beyond what is occupied by the already-known reverberating AGN. We then plot the parameter distribution predicted by the model that includes both $F_{\rm var}$ and lag as the features in Fig. 6, with the scattered red dots representing the current data of the observed reverberating AGN. It is clear that AGN displaying a large $F_{\rm var}$ of $\sim 0.4$–$0.8$ while showing lags of $\gtrsim 1000$ s are not yet observed (i.e. top-right region of Fig. 6). Our model predicts that the reverberating AGN belonging to this region, if they exist, will have a small mass of $\lesssim 10^{6.5}M_{\odot}$, and the lag-mass scaling relationship may be weaker. There is, however, no simple explanation of how a small mass can induce such a large lag amplitude while still maintaining a large $F_{\rm var}$. Perhaps such a configuration is unphysical, which would explain why none of these reverberating AGN has been observed so far.

Figure 6: Black hole masses plotted in the parameter space of our neural network model that employs both $F_{\rm var}$ and lag as features. The observed reverberating AGN data are shown as red dots in this parameter space. Clearly, they occupy only the regime that implies an anti-correlation between $F_{\rm var}$ and the lag. The Spearman’s rank correlation coefficient between the $F_{\rm var}$ and the lag of the reverberating AGN observed so far is $r_{s}=-0.46$.
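For readers who want to reproduce a Spearman's rank coefficient such as the one quoted in Fig. 6, the sketch below computes $r_{s}$ as the Pearson correlation of average ranks. It is a minimal stdlib-only illustration; the function names and the input values are hypothetical and do not correspond to the observed sample.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical F_var and lag values (illustration only).
fvar = [0.10, 0.20, 0.30, 0.40, 0.50]
lag_s = [900.0, 1200.0, 500.0, 300.0, 100.0]
r_s = spearman(fvar, lag_s)
print(round(r_s, 3))
```

A negative value indicates that sources with larger $F_{\rm var}$ tend to show smaller lags, which is the sense of the anti-correlation described in the caption.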

Then, we apply the model to the AGN listed in Table 4. These AGN are completely new to the machine (i.e. kept unseen throughout the training phase). They are non-reverberating AGN, however, so we use their mass and $F_{\rm var}$ to trace back to the lags that might exist. We find that the model strongly suggests no observed lags for most of these sources, in agreement with the observations. The mass distribution of the non-reverberating AGN in the model parameter space is illustrated as an example in Fig. 7. Some have too small a mass and $F_{\rm var}$ (e.g. ESO 113–G010, IRAS 18325–5926, MCG–6–30–15, NGC 4395, NGC 4593 and NGC 4748) to fit into the constrained parameter space for the X-ray reverberating AGN. Others do not exhibit the lags because both the central mass and $F_{\rm var}$ are too large, contrary to the strong anti-correlation between the mass and $F_{\rm var}$ predicted by the model (e.g. IRAS 13349+2438, Mrk 586, Mrk 704, PKS 0558–504 and RXJ 0136.9–3510).

Since the number of reverberating AGN discovered so far is quite small, the ML prediction for the non-reverberating AGN samples provides an indirect way to test the performance of the model. Given the high $R^{2}$ above 0.9 and the fact that the ML model rules out lags in the majority of these non-reverberating AGN, which were kept unseen during the training phase, we are confident that the constrained $F_{\rm var}$–lag–mass relation (e.g. Fig. 6) is reliable and that the model does not overfit the data.

Table 4: Observed AGN data for the final evaluation of the model. These are non-reverberating AGN that do not present clear Fe-K reverberation lags while showing variability that fits the criteria. The numbers in brackets denote the references: (1) Ponti et al. (2012); (2) González-Martín & Vaughan (2012); (3) Iwasawa et al. (2016); (4) Papadakis et al. (2010). (R) indicates the optical reverberation mass estimate. The lag properties predicted by the neural network model are given in the final column, where ‘–’ indicates no lag, because the mass is either too low (‘LM’) or too high (‘HM’) to be consistent with the constrained $F_{\rm var}$–lag–mass relation. Upper limits on the lags are reported where suggested by the model.
AGN name            True $\log(M/M_{\odot})$    $F_{\rm var}$    Predicted lag properties
ESO 113–G010        6.74 (1)                    0.159            –, LM
ESO 511–G030        8.66 (1)                    0.050            –, HM
IRAS 05078+1626     7.55 (1)                    0.063            $\lesssim 1300$ s
IRAS 13349+2438     7.7 (2)                     0.211            –, HM
IRAS 18325–5926     6.4 (3)                     0.215            –, LM
IZw1                7.4 (2)                     0.150            $\lesssim 600$ s
MCG–02–14–009       7.13 (1)                    0.126            $\lesssim 500$ s
MCG–6–30–15         6.3 (1)                     0.212            –, LM
Mrk 1040            7.6 (2)                     0.081            $\lesssim 1300$ s
Mrk 205             8.32 (1)                    0.075            –, HM
Mrk 586             7.6 (2)                     0.248            –, HM
Mrk 704             8.11 (1)                    0.248            –, HM
Mrk 766             6.822 (R)                   0.228            –, LM
Mrk 841             8.52 (1)                    0.142            –, HM
NGC 3227            6.775 (R)                   0.096            –, LM
NGC 3516            7.395 (R)                   0.088            $\lesssim 300$ s
NGC 4395            5.449 (R)                   0.392            –, LM
NGC 4593            6.882 (R)                   0.172            –, LM
NGC 4748            6.407 (R)                   0.161            –, LM
PKS 0558–504        7.8 (4)                     0.154            –, HM
RXJ 0136.9–3510     7.9 (2)                     0.315            –, HM

Figure 7: Parameter distribution predicted by the model, as in Fig. 6, over-plotted with the data of non-reverberating AGN. Their central mass and $F_{\rm var}$ are either too small (top panel) or too large (bottom panel) to fit into the constrained parameter space for the reverberating AGN, suggesting no observed Fe-K lag.

4.4 Simulated new reverberating samples and parameter correlations

Based on the $F_{\rm var}$–lag–mass parameter space constrained by the neural network model, we simulate 3200 new reverberating AGN samples to evaluate the obtained correlations. We investigate two possible cases: 1) when newly-discovered reverberating AGN samples are uniformly distributed into the constrained parameter space and 2) when newly-discovered reverberating AGN follow the current trend of the observed $F_{\rm var}$–lag anti-correlation.

Fig. 8 shows the sieve diagrams for the black hole mass of the 3200 simulated reverberating AGN samples covering the entire regime of the $F_{\rm var}$–lag parameter space. Model grids of $F_{\rm var}$ and lags with equal step sizes of 0.02 and 25 s, respectively, are used to produce a discrete uniform distribution of the samples. If the newly-discovered reverberating AGN lie uniformly in the constrained parameter regime, the association between $F_{\rm var}$ and mass shows a similar pattern to that in Fig. 3, but with a stronger correlation coefficient of $r_{s}=-0.883$. For the lag, in contrast, the lag–mass scaling relation cannot be recovered. The intense blue rectangle at lags of $\gtrsim 1000$ s and $M<10^{6}M_{\odot}$ means that the number of simulated AGN with a small mass and a large lag is significantly higher than expected under independence. Interestingly, these results suggest that the $F_{\rm var}$–mass anti-correlation is always conserved and probably becomes stronger with additional samples. The lag–mass relation, on the other hand, may become weaker if the observed $F_{\rm var}$ and lags become more independent with an increasing number of newly-discovered AGN.

Figure 8: Sieve diagrams of the mass and $F_{\rm var}$ (top panel) and the mass and lags (bottom panel) of the 3200 simulated reverberating AGN assuming a uniform distribution of the samples. The samples are drawn from a model grid with equal spacings of 0.02 for $F_{\rm var}$ and 25 s for the lags (i.e. reverberating AGN have no preferred values of $F_{\rm var}$ and lags). Note that the expected frequency under independence and the observed frequency are proportional to the area of each rectangle and to the number of squares inside it, respectively. The differences between the observed and expected frequencies are shown by the shading density. Blue (red) means the observed frequency is higher (lower) than the expected frequency. Under this assumption, the $F_{\rm var}$–mass anti-correlation is stronger ($r_{s}=-0.88$) while the lag–mass correlation is drastically weaker ($r_{s}<0.1$), compared to Fig. 3. See text for more details.
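A uniform grid with the spacings quoted in the Fig. 8 caption (0.02 in $F_{\rm var}$ and 25 s in lag) yields 3200 samples if, for instance, it spans 40 values of $F_{\rm var}$ and 80 values of the lag. The sketch below builds such a grid. The grid limits (0.02–0.80 and 25–2000 s) are an assumption chosen only to match the sample count, since the exact ranges are not stated here, and the mass assignment by the trained network is deliberately left out.

```python
FVAR_STEP = 0.02   # grid spacing in F_var (from the Fig. 8 caption)
LAG_STEP_S = 25.0  # grid spacing in lag, in seconds (from the Fig. 8 caption)

# Assumed grid extents: 40 F_var values and 80 lag values (hypothetical).
fvar_grid = [round(FVAR_STEP * i, 2) for i in range(1, 41)]  # 0.02 ... 0.80
lag_grid = [LAG_STEP_S * j for j in range(1, 81)]            # 25 ... 2000 s

# Every (F_var, lag) pair appears exactly once: a discrete uniform distribution.
samples = [(f, lag) for f in fvar_grid for lag in lag_grid]
print(len(samples), samples[0], samples[-1])
```

Each grid point would then be passed to the trained model to obtain the corresponding mass before the correlation coefficients are evaluated.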

We then investigate the case in which newly-discovered reverberating AGN follow the currently observed $F_{\rm var}$–lag anti-correlation ($r_{s}=-0.46$). A probability density function is employed to generate random values of the lags and $F_{\rm var}$, allowing small deviations (2, 5, 10 and 15 per cent) from the current trend of the $F_{\rm var}$–lag anti-correlation. Once the lags and $F_{\rm var}$ are drawn, the associated mass predicted by the neural network model is assigned. We produce 3200 AGN sources and find that the lag–mass correlation coefficient in this case varies within $0.45\lesssim r_{s}\lesssim 0.59$, which can be smaller or slightly larger than the correlation coefficient for the 22 observed reverberating AGN samples. Meanwhile, the $F_{\rm var}$–mass anti-correlation for these 3200 sources varies within $-0.82\lesssim r_{s}\lesssim -0.71$, which is comparable to or stronger than that of the reverberation samples so far.

Fig. 9 shows, as an example, the sieve diagrams for the black hole mass of 3200 simulated AGN following the current trend of the $F_{\rm var}$–lag relation, with $\pm 2$ per cent deviations allowed. The same pattern of association as in Fig. 3 is observed. The Venn diagram for these newly-simulated AGN is presented in Fig. 10. Most of the simulated sources ($\sim 85$ per cent) have $M>10^{6.5}M_{\odot}$ and $F_{\rm var}<0.5$, with $\sim 74$ per cent of these samples displaying lags of $<1000$ s. Note that in this example we allow the $F_{\rm var}$–lag relation to deviate from the current trend by only $\pm 2$ per cent, so samples in the top-right portion of Fig. 6 ($F_{\rm var}\gtrsim 0.5$, lags $\gtrsim 1000$ s and $M\lesssim 10^{6.5}M_{\odot}$) are not produced. Therefore, the lag–mass correlation coefficient does not change much. The corresponding Fe-K lag–mass relation is presented in Fig. 11. These results suggest that a tight anti-correlation between the lag and $F_{\rm var}$ is necessary to maintain the lag–mass scaling relation. On the other hand, the $F_{\rm var}$–mass anti-correlation is likely to be stronger whether or not new reverberating AGN show a dependence of their lags on $F_{\rm var}$.

Figure 9: Sieve diagrams of the mass and $F_{\rm var}$ (top panel) and the mass and lags (bottom panel) of 3200 reverberating AGN sources simulated under the constraint of the observed $F_{\rm var}$–lag relation ($r_{s}=-0.46$). In this illustration, the deviation of the $F_{\rm var}$–lag relation is $\sim 2$ per cent of the current trend. We find $r_{s}=0.59$ for the lags and the mass, and $r_{s}=-0.77$ for the $F_{\rm var}$ and the mass. These correlation coefficients are comparable to those of the 22 observed reverberating AGN (Fig. 3). Therefore, the lag–mass scaling relation is maintained through the tight anti-correlation between the lag and $F_{\rm var}$, and can be weaker if newly-discovered AGN samples are more uniformly distributed (Fig. 8).

Figure 10: Venn diagram showing the overlap of data instances from a collection of 3200 reverberating AGN sources, as in Fig. 9, when a $\pm 2$ per cent deviation of the $F_{\rm var}$–lag relation from the current trend is allowed. The number of samples corresponding to each criterion is presented. Since the $\pm 2$ per cent deviation is small, we see no simulated sample that shows $F_{\rm var}\gtrsim 0.5$ with lags $\gtrsim 1000$ s and $M\lesssim 10^{6.5}M_{\odot}$ (i.e. the top-right portion of the plot in Fig. 6). The majority of the simulated samples show $F_{\rm var}<0.5$ with lags $<1000$ s and $M>10^{6.5}M_{\odot}$, which is close to the trend of the reverberating AGN observed so far.

Figure 11: Fe-K lag amplitude versus mass for 3200 newly-simulated reverberating AGN sources under the constraint of the currently observed $F_{\rm var}$–lag relation, with $\pm 2$ per cent deviations allowed. The size and colour of the data points relate to the values of the corresponding $F_{\rm var}$ (e.g. a larger data-point size means a larger $F_{\rm var}$). We find that the Spearman’s rank coefficient between the lag and the mass is +0.59, which is not significantly different from the correlation coefficient obtained for just the 22 observed AGN samples in the XMM-Newton archive.

5 Discussion

According to our results, the developed ML models make more accurate predictions of the black hole mass than the linear regression model. The best combination of solver and activation function depends on the characteristics of the data. All neural network models here require the L-BFGS-B solver. Generally, L-BFGS-B is a quasi-Newton (approximate second-order) optimization method that is fastest for small convex problems or small data sets, which makes it appropriate for our data set. It employs the full training set to update the parameters at every iteration (Le et al., 2011). If the dataset is large, the L-BFGS-B computing time can be very long on a single machine, so it is hard to incorporate new data in an online environment. On the other hand, tanh is found to be the best activation function that provides high accuracy for generalized MLP architectures of neural networks (Karlik & Olgac, 2011; Montavon, Orr, & Müller, 2012). If $F_{\rm var}$ is the only feature, the relation between $F_{\rm var}$ and $\log(M/M_{\odot})$ is close to linear. This is why the rectified linear unit (ReLU) activation function performs better than tanh (a fully non-linear unit) in that case. All combined-feature models ($R^{2}\gtrsim 0.9$) developed here require the L-BFGS-B solver and the tanh activation function, in agreement with other works that use multi-feature/variable regression in MLP neural networks (e.g. Gheorghe & Badea, 2014; Artrith, Urban, & Ceder, 2017).

Kara et al. (2016a) found that the correlation coefficient of the frequency–mass relation ($r_{s}=-0.68$) is stronger than that of the lag–mass relation ($r_{s}=0.60$). They suggested using the frequency–mass relation rather than the lag–mass relation to determine the black hole mass. This argument, however, may be specific to the averaging scheme applied to the observational data (Hancock et al., in prep.). Here, we find the correlation between the Fe-K lag amplitude and the mass for AGN sources to be $r_{s}=0.589$, confirming what was reported by Kara et al. (2016a). Even though both $F_{\rm var}$ and $\log(C)$ are produced from the light curves extracted in the same energy band of 2–10 keV, the black hole mass shows a significantly stronger correlation with $F_{\rm var}$ than with $\log(C)$. Also, both $\log(L_{\rm bol})$ and $\log(C)$ show a relatively weak correlation with the black hole mass. This suggests that the X-ray timing data may be more useful for predicting the black hole mass than the time-averaged X-ray data.

Hancock et al. (in prep.) studied the correlations between the central mass and the lags, but using the lags measured between the soft-excess 0.3–0.8 keV and continuum-dominated 1–4 keV bands. The correlation coefficient was found to be $r_{s}=0.72$. This may suggest that the lags in the soft band are more strongly correlated with the mass than the lags in the Fe-K band. Although the Fe-K band is sometimes contaminated by neutral distant reflection, the distant reflection varies on timescales different from those of the inner-disc reflection. Therefore, unlike the soft-excess lags whose origin is ambiguous, the Fe-K lags are likely to be a clean signature of X-ray reverberation. The Fe-K band may then provide a more accurate picture of how strongly the signals produced via X-ray reverberation correlate with the black hole mass.

We also analyze the correlation between the black hole mass and other features of the ML model. Interestingly, we find that the correlation between the mass and $F_{\rm var}$ is $r_{s}=-0.718$, which is stronger than the lag–mass relation and even stronger than the frequency–mass relation reported by Kara et al. (2016a). However, the prediction accuracy is higher when using the lags alone than when using $F_{\rm var}$ alone. Moreover, for a specific source, analysing individual observations or combining them into, e.g., high-flux and low-flux states results in different time-lag estimates for a single mass. A single parameter therefore cannot be a good predictor of the mass, since its value changes with how the data of each AGN are selected and combined while the mass remains constant. It is then better to use both $F_{\rm var}$ and the lag amplitude as the mass predictors, which also improves the model accuracy ($R^{2}=0.9124$).
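The accuracy figures quoted for the models are the coefficient of determination $R^{2}$ and, in the conclusions, the mean absolute error (MAE). The sketch below spells out both quantities, assuming their standard definitions (the same ones implemented by `sklearn.metrics.r2_score` and `sklearn.metrics.mean_absolute_error`). The input values are hypothetical and serve only to make the arithmetic easy to follow.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted log-masses (illustration only).
true_log_mass = [6.0, 7.0, 8.0]
pred_log_mass = [6.5, 7.0, 7.5]

r2 = r2_score(true_log_mass, pred_log_mass)
mae = mean_absolute_error(true_log_mass, pred_log_mass)
print(r2, mae)
```

Here the residual sum of squares is 0.5 and the total sum of squares is 2, so $R^{2}=0.75$, while the MAE is one third of a dex.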

Based on the $F_{\rm var}$–lag–mass relation predicted by the ML model, reverberating AGN with a specific black hole mass can exhibit $F_{\rm var}$ and lags only within a limited regime. The model can rule out the possibility of lags for the majority of the non-reverberating AGN (Table 4 and Fig. 7), revealing its potential to make accurate predictions for new data. Since the model can be successfully applied to both reverberating and non-reverberating AGN, it suggests a common origin of the main variability that contributes to $F_{\rm var}$ in both AGN groups. This suits the framework in which the X-ray variability is driven by, e.g., disc propagating fluctuations that can operate in both reverberating and non-reverberating AGN.

Note that the model still predicts the presence of Fe-K lags for some sources included in the non-reverberation group (Table 4). For example, the Fe-K lags of IZw1 are predicted to be $\lesssim 600$ s. In fact, this is in agreement with Wilkins et al. (2021), who reported observations of X-ray flares around the central black hole in IZw1 and detected Fe-K reverberation lags of $\sim 746\pm 157$ s using a method different from the standard Fourier analysis. Moreover, based on the XMM-Newton observations, Zoghbi et al. (2013) reported Fe-K reverberation lags in MCG–5–23–16 and SWIFT J2127.4+5654, but a new analysis using NuSTAR data suggested no strong evidence for relativistic reverberation in either AGN (Zoghbi, Miller, & Cackett, 2021). It is not straightforward to determine whether these AGN exhibit no intrinsic reverberation lags or whether the inconsistencies arise from the different methods and model assumptions used to quantify the significance of the lag. Nevertheless, we try moving these samples to the non-reverberating AGN group, and find only a slight change in the inferred lag–mass correlation (e.g. the lag–mass correlation coefficient $r_{s}$ changes by less than $\sim 4$ per cent). This is because these inconsistencies are found in only a minority of the samples. Therefore, moving them between the reverberation and non-reverberation groups does not change the trend of the key results here.

X-ray reverberation produces a dip in the PSD profile, so it dilutes $F_{\rm var}$ on the particular timescales where reverberation dominates (Papadakis et al., 2016; Chainakun, 2019). The amplitude of the dip increases with increasing reflection fraction, resulting in a decrease in $F_{\rm var}$ on the reverberation timescales. The mass, on the other hand, scales up the intrinsic lags. However, $F_{\rm var}$ here represents the fractional excess variance contributed by all mechanisms that affect the X-ray variability in the 2–10 keV band, not only by reverberation. For example, $F_{\rm var}$ can be affected by a constant or less-variable emission component, e.g. from outflowing gas, that varies among different AGN (Parker et al., 2021). A constant emission component raises the mean but does not affect the standard deviation of the data, so $F_{\rm var}$ decreases. Constant absorption, on the other hand, lowers both the mean and the standard deviation of the data proportionally, so its presence has little effect on $F_{\rm var}$. A deviation from the $F_{\rm var}$–lag relation may then be induced by, e.g., the presence of a constant or less-variable emission component, resulting in a change of the lag–mass correlation coefficient.
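The different effects of constant emission and constant absorption on $F_{\rm var}$ can be checked numerically. The sketch below assumes the usual definition, $F_{\rm var}=\sqrt{S^{2}-\overline{\sigma_{\rm err}^{2}}}/\bar{x}$, with the measurement-error term set to zero for simplicity; constant emission is modelled as an additive offset and constant absorption as a fixed multiplicative transmission factor. The light curve is hypothetical.

```python
from statistics import mean, variance


def fractional_variability(flux, mean_sq_error=0.0):
    """F_var = sqrt(S^2 - <err^2>) / <flux>, with the excess variance floored at 0."""
    excess = max(variance(flux) - mean_sq_error, 0.0)
    return excess ** 0.5 / mean(flux)


# Hypothetical light curve in arbitrary count-rate units (illustration only).
base = [8.0, 12.0, 10.0, 14.0, 6.0]
with_emission = [x + 10.0 for x in base]   # additive, non-variable component
with_absorption = [0.5 * x for x in base]  # constant transmission factor

fvar_base = fractional_variability(base)
fvar_emission = fractional_variability(with_emission)
fvar_absorption = fractional_variability(with_absorption)
print(fvar_base, fvar_emission, fvar_absorption)
```

Doubling the mean with a constant offset halves $F_{\rm var}$ in this example, whereas the multiplicative factor cancels between the standard deviation and the mean and leaves $F_{\rm var}$ unchanged.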

Our results suggest that to maintain the lag–mass scaling relation, the anti-correlation between the lag and $F_{\rm var}$ must be preserved. In contrast, the $F_{\rm var}$–mass anti-correlation seems to be stronger even if new reverberating AGN show lags that are more independent of $F_{\rm var}$. Moreover, the model suggests a regime of $F_{\rm var}\gtrsim 0.5$, lags $\gtrsim 1000$ s and $M\lesssim 10^{6.5}M_{\odot}$ where reverberating AGN are not yet observed (top-right portion of Fig. 6). It possibly represents a forbidden, unphysical regime, which would explain why no reverberating AGN sample falls into it. Although AGN with a constant emission component can introduce a deviation in the $F_{\rm var}$–lag correlation coefficient, they may not fit into this ambiguous regime: it corresponds to samples with high $F_{\rm var}$, whereas constant emission lowers $F_{\rm var}$.

Furthermore, the observed flux in a particular energy band always contains both continuum and reflection flux. This introduces dilution effects so that the observed lags are smaller than the intrinsic lags (e.g. Wilkins & Fabian, 2013; Chainakun & Young, 2015). Chainakun et al. (2019) studied the lags produced by an extended corona under the inverse-Compton scattering scenario and found that, with a fixed coronal size and geometry, the coronal temperature and optical depth can affect the lags. The more complex the model is, the less straightforwardly the measured reverberation lags relate to the true light-travel distance. In fact, Hinkle & Mushotzky (2021) investigated the fundamental X-ray corona properties of 33 AGN observed under the 105-month Swift/BAT campaign together with the archival XMM-Newton and NuSTAR data. They found no strong correlations between the black hole mass, coronal compactness and coronal temperature. Therefore, if the lags depend significantly on the coronal compactness or temperature, the lags may not strongly correlate with the mass. A weaker correlation between the lags and the mass, if observed, may then suggest the extended corona framework in which the lag amplitudes of the majority of the sources are affected more strongly by the properties of the corona than by its geometry.

According to our results, $F_{\rm var}$ and the lag amplitude are the key parameters of the X-ray variability data for predicting the black hole mass. No clear trend of how the mass varies with $\log(C)$ and $z$ is seen. Based on the current XMM-Newton observations, we find that $z$ is correlated with $L_{\rm bol}$ ($r_{s}=0.796$), but there is no clear pattern of association between $z$ and the other parameters (time lags, mass and $F_{\rm var}$). Perhaps this is because the samples probed by XMM-Newton are relatively nearby and are not distributed widely enough across a broad range of redshift. Fig. 12 shows the distribution of our reverberating AGN samples in the parameter space when the redshift $z$ is discretized into two groups: $z<0.015$ (low redshift) and $z\geq 0.015$ (high redshift). The results show that we tend to observe the high-redshift sources when they exhibit relatively large $L_{\rm bol}$. The pattern of association between $z$ and the other parameters cannot be easily revealed, however, even when the samples are simply grouped into low and high redshifts. This is why we still cannot obtain a meaningful interpretation for the redshift associated with the lag and the mass. Note that these results do not imply that there is no effect of redshift at all; they instead suggest that the number of current samples is not enough for the model to gain useful information relating to $z$. In the Athena era, when we can probe more reverberating sources at higher redshift, the model should be able to provide more insights into not only the trend of the scaling relations but also the redshift dependence.

It is also possible to calculate $F_{\rm var}$ in the frequency domain by integrating the corresponding power spectrum between two frequencies, so that the obtained $F_{\rm var}$ is specific to a particular range of timescales; this will be investigated in future work. Finally, there have also been reports of a correlation between the central black hole mass and the total stellar mass of the host galaxy (e.g. Reines & Volonteri, 2015). Investigating the relationship between the AGN and host-galaxy parameters using machine learning techniques is also planned for the future.
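As a rough illustration of the frequency-domain approach mentioned above, the sketch below integrates a power spectrum between two frequencies and takes the square root. It assumes the PSD is in fractional rms-squared normalization, so that its integral gives the fractional variance in that band. The power-law PSD, its normalization and the frequency limits are hypothetical choices made only so that the result can be compared with an analytic value.

```python
import math


def band_limited_fvar(freqs, powers):
    """Square root of the trapezoidal integral of the PSD over the given grid."""
    band_variance = sum(
        0.5 * (p0 + p1) * (f1 - f0)
        for f0, f1, p0, p1 in zip(freqs, freqs[1:], powers, powers[1:])
    )
    return math.sqrt(band_variance)


# Hypothetical PSD: P(f) = A / f between f_lo and f_hi (illustration only).
A = 0.01
f_lo, f_hi, n_steps = 1e-5, 1e-3, 2000
freqs = [f_lo * (f_hi / f_lo) ** (i / n_steps) for i in range(n_steps + 1)]
powers = [A / f for f in freqs]

fvar_band = band_limited_fvar(freqs, powers)
fvar_analytic = math.sqrt(A * math.log(f_hi / f_lo))  # exact for P(f) = A / f
print(fvar_band, fvar_analytic)
```

Changing `f_lo` and `f_hi` selects the range of timescales over which the variability is measured, for example the band where reverberation is expected to dominate.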

Figure 12: Pairwise relationships between the variables of the current reverberating AGN samples when the redshift $z$ is discretized into two groups: $z<0.015$ (low redshift) and $z\geq 0.015$ (high redshift). The diagonal plots show the univariate marginal distribution of the data in each column, derived from a layered kernel density estimate (KDE). The high-redshift samples occupy the parameter space associated with high $L_{\rm bol}$. However, no clear separation between the low- and high-$z$ samples is seen when the data are plotted in the lag, mass and $F_{\rm var}$ parameter space. The number of current XMM-Newton samples is not enough for the ML model to gain useful information on how the lag, mass and $F_{\rm var}$ correlate with $z$.

6 Conclusion

We investigate several scenarios of applying the neural network to predict the black hole mass and correlations in AGN. The model does not require an assumed source geometry in advance, but this does not mean that the inferred correlations are independent of source geometry. The predictions of the neural network models are much more accurate than those of the standard linear regression in all cases, providing an independent way to predict the AGN mass. The best model that uses either $F_{\rm var}$ or the lag alone contains $\sim 90$–130 neurons ($R^{2}\sim 0.6$–0.7 and MAE $\sim 0.2$–0.3). It is, however, better to use both $F_{\rm var}$ and the lag amplitude to predict the central mass ($R^{2}=0.9124$ and MAE = 0.1077), which requires 168 neurons. The mass distribution predicted by the model satisfies the lag–mass scaling relation of the observed reverberating AGN samples. Despite the limited number of samples available for training the machine, we illustrate that the model can potentially be used to identify and exclude the AGN samples that do not exhibit the lags. This, in turn, suggests a common origin of the variability, e.g. disc propagating fluctuations, that drives $F_{\rm var}$ in both reverberating and non-reverberating AGN.

The $F_{\rm var}$–mass anti-correlation is always preserved and probably becomes stronger with an increasing number of sources. The model reveals a regime in the parameter space in which no reverberating AGN samples are yet found ($F_{\rm var}\gtrsim 0.5$ with lags $\gtrsim 1000$ s and $M\lesssim 10^{6.5}M_{\odot}$). In fact, there is no straightforward mechanism for an AGN to produce a high $F_{\rm var}$ and large lags with such a small mass. Perhaps this is the reason why we do not observe reverberating AGN in this regime. The model shows that a weaker correlation between the lag and the mass is possible with an increasing number of newly-discovered reverberating AGN. The AGN responsible for this may require an extended corona whose properties (e.g. temperature and compactness), rather than geometry, dominate the measured time lags. A hint of their presence may then be a noticeable decrease in the lag–mass correlation coefficient.

A more accurate mass can be retrieved using all features investigated here ($R^{2}=0.9993$ and MAE = 0.0100). The neural network in this case contains 249 neurons. The model with all features coupled together is probably too complex, so no clear pattern of association involving $z$ and $\log(C)$ is observed. The obtained neural network models are observationally driven since they are trained and tested using real observational data. The Fe-K reverberation samples observed so far are representative enough that the model can draw some meaningful correlations between their observable variables. It is possible that the source spectral state affects the detection probability. For example, super-soft sources at the high end of the $L_{\rm bol}$ parameter space may have poor count rates in the hard band, so detection of Fe-K lags becomes more difficult. Given that the obtained correlation between $L_{\rm bol}$ and the mass is very weak (Table 2), a very large number of these sources would be required to drive a significant and meaningful relationship between $L_{\rm bol}$ and the mass in the samples. We suspect that these extreme sources are unlikely to be the majority of the population, and hence do not significantly alter the inferred correlations. This work illustrates an application of the ML technique to construct a multi-feature parameter space in which samples of reverberating AGN can be simulated. A larger number of newly-discovered reverberating AGN will help place robust constraints on the extrapolated results predicted by the models.

Acknowledgements

This research has received funding support from the NSRF via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation (grant number B16F640076). The ML models in this work are adopted from the sklearn software package. The calculations were carried out using the high performance computing resources in the Institute of Science and the Centre for Scientific and Technological Equipment, Suranaree University of Technology.

Data availability

The data underlying this article can be accessed from XMM-Newton Observatory (http://nxsa.esac.esa.int). The neural network algorithm adopted here is available in scikit-learn at https://scikit-learn.org/. The derived data and developed model underlying this article will be shared on reasonable request to the corresponding author.

References

  • Agís-González et al. (2014) Agís-González B., Miniutti G., Kara E., Fabian A. C., Sanfrutos M., Risaliti G., Bianchi S., et al., 2014, MNRAS, 443, 2862.
  • Alston et al. (2014) Alston W. N., Markeviciute J., Kara E., Fabian A. C., Middleton M., 2014, MNRAS, 445, L16.
  • Alston et al. (2015) Alston W. N., Parker M. L., Markevičiūtė J., Fabian A. C., Middleton M., Lohfink A., Kara E., et al., 2015, MNRAS, 449, 467.
  • Alston et al. (2020) Alston W. N., Fabian A. C., Kara E., Parker M. L., Dovciak M., Pinto C., Jiang J., et al., 2020, NatAs, 4, 597.
  • Artrith, Urban, & Ceder (2017) Artrith N., Urban A., Ceder G., 2017, Phys. Rev. B, 96, 1.
  • Bentz & Katz (2015) Bentz M. C., Katz S., 2015, PASP, 127, 67.
  • Bian & Zhao (2003) Bian W., Zhao Y., 2003, MNRAS, 343, 164.
  • Caballero-García et al. (2018) Caballero-García M. D., Papadakis I. E., Dovčiak M., Bursa M., Epitropakis A., Karas V., Svoboda J., 2018, MNRAS, 480, 2650.
  • Caballero-García et al. (2020) Caballero-García M. D., Papadakis I. E., Dovčiak M., Bursa M., Svoboda J., Karas V., 2020, MNRAS, 498, 3184.
  • Cackett et al. (2014) Cackett E. M., Zoghbi A., Reynolds C., Fabian A. C., Kara E., Uttley P., Wilkins D. R., 2014, MNRAS, 438, 2980.
  • Cackett, Bentz, & Kara (2021) Cackett E. M., Bentz M. C., Kara E., 2021, iSci, 24, 102557.
  • Chainakun & Young (2015) Chainakun P., Young A. J., 2015, MNRAS, 452, 333.
  • Chainakun, Young, & Kara (2016) Chainakun P., Young A. J., Kara E., 2016, MNRAS, 460, 3076.
  • Chainakun & Young (2017) Chainakun P., Young A. J., 2017, MNRAS, 465, 3965.
  • Chainakun (2019) Chainakun P., 2019, ApJ, 878, 20.
  • Chainakun et al. (2019) Chainakun P., Watcharangkool A., Young A. J., Hancock S., 2019, MNRAS, 487, 667.
  • Chainakun et al. (2021a) Chainakun P., Mankatwit N., Thongkonsing P., Young A. J., 2021a, MNRAS, 506, 5318.
  • Chainakun et al. (2021b) Chainakun P., Luangtip W., Young A. J., Thongkonsing P., Srichok M., 2021b, A&A, 645, A99.
  • De Marco et al. (2013) De Marco B., Ponti G., Cappi M., Dadina M., Uttley P., Cackett E. M., Fabian A. C., et al., 2013, MNRAS, 431, 2441.
  • Demšar et al. (2013) Demšar J., Curk T., Erjavec A., Gorup Č., Hočevar T., Milutinovič M., Možina M., et al., 2013, J. Mach. Learn. Res., 14, 2349.
  • Emmanoulopoulos et al. (2014) Emmanoulopoulos D., Papadakis I. E., Dovčiak M., McHardy I. M., 2014, MNRAS, 439, 3931.
  • Emmanoulopoulos et al. (2016) Emmanoulopoulos D., Papadakis I. E., Epitropakis A., Pecháček T., Dovčiak M., McHardy I. M., 2016, MNRAS, 461, 1642.
  • Epitropakis et al. (2016) Epitropakis A., Papadakis I. E., Dovčiak M., Pecháček T., Emmanoulopoulos D., Karas V., McHardy I. M., 2016, A&A, 594, A71.
  • Fabian et al. (2009) Fabian A. C., Zoghbi A., Ross R. R., Uttley P., Gallo L. C., Brandt W. N., Blustin A. J., et al., 2009, Natur, 459, 540.
  • Gheorghe & Badea (2014) Gheorghe R., Badea L. M., 2014, Technological and Economic Development of Economy, 20, 1.
  • González-Martín & Vaughan (2012) González-Martín O., Vaughan S., 2012, A&A, 544, A80.
  • Grier et al. (2013) Grier C. J., Martini P., Watson L. C., Peterson B. M., Bentz M. C., Dasyra K. M., Dietrich M., et al., 2013, ApJ, 773, 90.
  • Hinkle & Mushotzky (2021) Hinkle J. T., Mushotzky R., 2021, MNRAS, 506, 4960.
  • Ingram et al. (2019) Ingram A., Mastroserio G., Dauser T., Hovenkamp P., van der Klis M., García J. A., 2019, MNRAS, 488, 324.
  • Iwasawa et al. (2016) Iwasawa K., Fabian A. C., Kara E., Reynolds C. S., Miniutti G., Tombesi F., 2016, A&A, 592, A98.
  • Kara et al. (2013a) Kara E., Fabian A. C., Cackett E. M., Steiner J. F., Uttley P., Wilkins D. R., Zoghbi A., 2013a, MNRAS, 428, 2795.
  • Kara et al. (2013b) Kara E., Fabian A. C., Cackett E. M., Miniutti G., Uttley P., 2013b, MNRAS, 430, 1408.
  • Kara et al. (2013c) Kara E., Fabian A. C., Cackett E. M., Uttley P., Wilkins D. R., Zoghbi A., 2013c, MNRAS, 434, 1129.
  • Kara et al. (2014) Kara E., Cackett E. M., Fabian A. C., Reynolds C., Uttley P., 2014, MNRAS, 439, L26.
  • Kara et al. (2015) Kara E., Zoghbi A., Marinucci A., Walton D. J., Fabian A. C., Risaliti G., Boggs S. E., et al., 2015, MNRAS, 446, 737. doi:10.1093/mnras/stu2136
  • Kara et al. (2016a) Kara E., Alston W. N., Fabian A. C., Cackett E. M., Uttley P., Reynolds C. S., Zoghbi A., 2016a, MNRAS, 462, 511.
  • Kara et al. (2016b) Kara E., Miller J. M., Reynolds C., Dai L., 2016b, Natur, 535, 388.
  • Kara et al. (2019) Kara E., Steiner J. F., Fabian A. C., Cackett E. M., Uttley P., Remillard R. A., Gendreau K. C., et al., 2019, Natur, 565, 198.
  • Karlik & Olgac (2011) Karlik B., Olgac A. V., 2011, Int. J. Artif. Intell. Expert Syst., 1, 4.
  • Kawamura et al. (2021) Kawamura T., Axelsson M., Done C., Takahashi T., 2021, arXiv, arXiv:2107.12517
  • King, Lohfink, & Kara (2017) King A. L., Lohfink A., Kara E., 2017, ApJ, 835, 226.
  • Le et al. (2011) Le Q. V., Ngiam J., Coates A., Lahiri A., Prochnow B., Ng A. Y., 2011, On Optimization Methods for Deep Learning. Omnipress, Madison, WI, USA, p. 265
  • Luangtip et al. (2021) Luangtip W., Chainakun P., Loekkesee S., Deesamer C., Ngonsamrong T., Sintusiri T., 2021, MNRAS, 507, 6094.
  • Mahmoud, Done, & De Marco (2019) Mahmoud R. D., Done C., De Marco B., 2019, MNRAS, 486, 2137.
  • Malizia et al. (2008) Malizia A., Bassani L., Bird A. J., Landi R., Masetti N., de Rosa A., Panessa F., et al., 2008, MNRAS, 389, 1360.
  • Marconi et al. (2008) Marconi A., Axon D. J., Maiolino R., Nagao T., Pastorini G., Pietrini P., Robinson A., et al., 2008, ApJ, 678, 693.
  • Marinucci et al. (2014) Marinucci A., Matt G., Kara E., Miniutti G., Elvis M., Arevalo P., Ballantyne D. R., et al., 2014, MNRAS, 440, 2347. doi:10.1093/mnras/stu404
  • Montavon, Orr, & Müller (2012) Montavon G., Orr G. B., Müller K. R., 2012, Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, 7700. Springer, Berlin Heidelberg
  • Papadakis et al. (2010) Papadakis I. E., Brinkmann W., Gliozzi M., Raeth C., Nicastro F., Conciatore M. L., 2010, A&A, 510, A65.
  • Papadakis et al. (2016) Papadakis I., Pecháček T., Dovčiak M., et al., 2016, A&A, 588, A13.
  • Parker et al. (2021) Parker M. L., Alston W. N., Härer L., Igo Z., Joyce A., Buisson D. J. K., Chainakun P., et al., 2021, MNRAS, 508, 1798.
  • Pedregosa et al. (2012) Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., et al., 2012, arXiv, arXiv:1201.0490
  • Ponti et al. (2012) Ponti G., Papadakis I., Bianchi S., Guainazzi M., Matt G., Uttley P., Bonilla N. F., 2012, A&A, 542, A83.
  • Reines & Volonteri (2015) Reines A. E., Volonteri M., 2015, ApJ, 813, 82.
  • Schulz, Knake, & Schmidt-Kaler (1994) Schulz H., Knake A., Schmidt-Kaler T., 1994, A&A, 288, 425.
  • Uttley et al. (2014) Uttley P., Cackett E. M., Fabian A. C., Kara E., Wilkins D. R., 2014, A&ARv, 22, 72.
  • Vasudevan & Fabian (2007) Vasudevan R. V., Fabian A. C., 2007, MNRAS, 381, 1235.
  • Vasudevan & Fabian (2009) Vasudevan R. V., Fabian A. C., 2009, MNRAS, 392, 1124.
  • Vasudevan et al. (2010) Vasudevan R. V., Fabian A. C., Gandhi P., Winter L. M., Mushotzky R. F., 2010, MNRAS, 402, 1081.
  • Vaughan, Fabian, & Nandra (2003) Vaughan S., Fabian A. C., Nandra K., 2003, MNRAS, 339, 1237.
  • Vincentelli et al. (2020) Vincentelli F. M., Mastroserio G., McHardy I., Ingram A., Pahari M., 2020, MNRAS, 492, 1135. doi:10.1093/mnras/stz3511
  • Wang, Watarai, & Mineshige (2004) Wang J.-M., Watarai K.-Y., Mineshige S., 2004, ApJL, 607, L107.
  • Wilkins & Fabian (2013) Wilkins D. R., Fabian A. C., 2013, MNRAS, 430, 247.
  • Wilkins et al. (2016) Wilkins D. R., Cackett E. M., Fabian A. C., Reynolds C. S., 2016, MNRAS, 458, 200.
  • Wilkins et al. (2021) Wilkins D. R., Gallo L. C., Costantini E., Brandt W. N., Blandford R. D., 2021, Natur, 595, 657.
  • Zoghbi et al. (2012) Zoghbi A., Fabian A. C., Reynolds C. S., Cackett E. M., 2012, MNRAS, 422, 129.
  • Zoghbi et al. (2013) Zoghbi A., Reynolds C., Cackett E. M., Miniutti G., Kara E., Fabian A. C., 2013, ApJ, 767, 121.
  • Zoghbi et al. (2014) Zoghbi A., Cackett E. M., Reynolds C., Kara E., Harrison F. A., Fabian A. C., Lohfink A., et al., 2014, ApJ, 789, 56.
  • Zoghbi, Miller, & Cackett (2021) Zoghbi A., Miller J. M., Cackett E., 2021, ApJ, 912, 42.