 Research
 Open Access
A continuous data-driven translational model to evaluate effectiveness of population-level health interventions: case study, smoking ban in public places on hospital admissions for acute coronary events
Journal of Translational Medicine volume 18, Article number: 466 (2020)
Abstract
Background
An important task in developing accurate public health intervention evaluation methods based on historical interrupted time series (ITS) records is to determine the exact lag time between pre- and post-intervention. We propose a novel continuous transitional data-driven hybrid methodology using a nonlinear approach based on a combination of stochastic and artificial intelligence methods that facilitates the evaluation of ITS data without knowledge of lag time. Understanding the influence of an implemented intervention on outcome(s) is imperative for decision makers in order to manage health systems accurately and in a timely manner.
Methods
To validate the developed hybrid model, we used, as an example, a published dataset based on a real health problem: the effects of the Italian smoking ban in public spaces on hospital admissions for acute coronary events. We employed a continuous methodology based on data preprocessing to identify linear and nonlinear components, in which autoregressive moving average and generalized structure group method of data handling models were combined to model the stochastic and nonlinear components of ITS. We analyzed the rate of admission for acute coronary events from January 2002 to November 2006 using this new data-driven hybrid methodology, which allowed for long-term outcome prediction.
Results
The Pearson correlation coefficient of the proposed combined transitional data-driven model improved by an average of 17.74% over the single stochastic model and 2.05% over the nonlinear model. In addition, the developed model achieved a mean absolute percentage error of 2.77% and a correlation coefficient of 0.89, compared with 4.02% and 0.76, respectively. Importantly, this model does not use any predefined lag time between pre- and post-intervention.
Conclusions
Most previous studies employed linear regression and assumed a lag time to interpret the impact of an intervention on public health outcomes. The proposed hybrid methodology improved ITS prediction over conventional methods and could be used as a reliable alternative in public health intervention evaluation.
Introduction
Due to advances in technology and improvements in recording reliable data and sharing methods, the time series (TS) concept has emerged in many theoretical and practical studies over the past few decades [1]. This concept allows researchers to access the outcome of any phenomenon or intervention, at any time, with minimum cost and effort, and to plan possible solutions and control measures based on the forecasted data [2]. Therefore, improving knowledge about studying TS, preprocessing, modeling and, if needed, postprocessing is imperative [3].
In the domain of public health interventions, the interrupted time series (ITS) concept has been widely employed to evaluate the impact of a new intervention at a known point in time in routinely observed data [4,5,6,7,8,9,10]. ITS is fundamentally a sequence of outcomes over uniformly time-spaced intervals that are affected by an intervention at specific points in time or by change points. The outcome of interest shows a variation from its previous pattern due to the effect of the intervention. The applied intervention splits TS data into pre- and post-intervention periods. Based on this definition, Wagner et al. [11] proposed segmented regression analysis for evaluating intervention impacts on the outcomes of interest in ITS studies. In this approach, the choice of each segment is based on the change point, with a possible additional time lag in some cases, in order for the intervention to have an effect [12,13,14,15,16,17]. In addition, for the pre- and post-intervention segments of a TS, the level and trend values should be determined either by linear [17] or nonlinear [6] approaches. Therefore, accurate values of the change point and time lag parameters are essential in segmented regression analysis.
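To make the role of the change point concrete, the following is a minimal sketch of a segmented linear regression with a level change and a trend change at a known change point. It is an illustration under stated assumptions, not the formulation used in the cited studies: the function name `segmented_regression`, the four-term design matrix and the synthetic series are all hypothetical.

```python
import numpy as np

def segmented_regression(y, change_point):
    """Least-squares fit of y = b0 + b1*t + b2*step + b3*(t - change_point)*step.

    `change_point` is the (assumed known) index at which the intervention starts.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    step = (t >= change_point).astype(float)  # 0 before, 1 from the intervention on
    X = np.column_stack([np.ones_like(t), t, step, (t - change_point) * step])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # baseline level, baseline trend, level change, trend change

# Synthetic, noise-free series (hypothetical numbers, not the ACE data)
t_demo = np.arange(60, dtype=float)
step_demo = (t_demo >= 36).astype(float)
y_demo = 200.0 + 0.5 * t_demo - 20.0 * step_demo - 0.3 * (t_demo - 36) * step_demo
coef_demo = segmented_regression(y_demo, change_point=36)
```

If the change point passed to such a fit is wrong or the effect is lagged, the estimated level and trend changes absorb the misplacement, which is the sensitivity the continuous approach aims to avoid.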
Applying an intervention at a change point produces different possible outcome patterns in the post-intervention period for both level and trend parameters. Figure 1 illustrates some possible impacts of an intervention on the post-intervention period. As shown in Fig. 1a–c, the intervention may lead to an immediate change in level (or intercept), a change in level after a time lag, or a temporary level change. Other possible patterns are an immediate change in slope (or trend), a change in slope after a time lag, or a temporary slope change, as shown in Fig. 1d–f, respectively. In some cases, both parameters change, either immediately, after a time lag, or temporarily (Fig. 1g–i).
Despite the popularity of, and consensus on, segmented regression-based methods for solving ITS problems, selecting the most appropriate time lag is a challenging task with an important impact on the results of this type of modeling. The difficulty is that there is no specific rule to define the time lag between the pre- and post-intervention periods. In some cases, outcomes respond to the implemented strategies with an unknown delay, and the lag may extend long after an intervention. However, when segmented regression approaches are used in ITS modeling, the exact time lag after an intervention must be taken into account to guarantee the accuracy and appropriateness of the modeling results. In addition, an undocumented change point seriously complicates ITS analysis. A continuous nonlinear TS method would be a reliable alternative if it could release ITS analysis from all these fundamental concerns. Therefore, it is necessary to explore the use of linear models, nonlinear models, or a combination of both for solving such problems.
Over the past few years, soft computing methods have been employed across domains and have become established as reliable tools for modeling complex systems and predicting different phenomena in healthcare [18,19,20,21,22,23,24]. Among soft computing techniques, the Group Method of Data Handling (GMDH) is a common self-organizing heuristic model, which can be used for simulating complicated nonlinear problems. This evolutionary procedure is performed by dividing a complex problem into smaller and simpler problems. Based on GMDH, this study proposes a novel methodology for the continuous modeling of an ITS based on data preprocessing. An example of the novel ITS modeling uses a linear-based stochastic model, a nonlinear-based model and an integration of a stochastic and a nonlinear model (hybrid). In order to run the models, certain tests and preprocessing methods are initially applied to the TS to prepare the data for stochastic modeling. It is crucial to investigate the structure of the TS being studied prior to modeling. Therefore, the TS undergoes stationarity testing along with normality testing. After surveying the characteristics of the TS, stationarizing methods appropriate to the TS are used. Then, in case of a non-normal distribution, a normal transformation is applied to the stationarized TS. For the second TS modeling approach, the dataset is modeled with an artificial intelligence (AI) method, which is, in this case, the Generalized Structure Group Method of Data Handling (GSGMDH). In the third and final step, a hybrid model that combines the linear and nonlinear results is applied. Finally, the results are compared according to various indices and methods. Using this method therefore facilitates modeling the ITS continuously, i.e. there is no need to identify the change point and intervention lag time.
Dataset description
Barone-Adesi et al. [25] carried out an extensive study on the effect of a smoking ban in public places on hospital admissions for acute coronary events (ACEs). In January 2005, Italy introduced legislation that prohibits smoking in indoor public spaces, the goal of which was the reduction of health issues caused by secondhand smoke [25]. Secondhand smoke consists of smoke exhaled by smokers and smoke from lit cigarettes, and causes numerous health problems in nonsmokers every year, as well as high treatment costs for both patients and the government. The ban came into force on 10 January 2005 to confront the growing trend of ACEs and to control this problem.
Bernal et al. [6] used a subset of ACEs data from subjects in Sicily, Italy, between 2002 and 2006 among those aged 0–69 years. They analysed the ITS data by applying segmented linear regression to the standardized rate of ACEs TS associated with the implementation of a ban on smoking in all indoor public places, to calculate the change in the subsequent outcome levels and trends. Based on Barone-Adesi et al.’s [25] assumption, Bernal et al. [6] assumed that only a level change in ACEs occurred, with no lag between the pre- and post-intervention segments in the modeling procedure.
Here, we used the dataset from Bernal et al. (Fig. 4, [6]) as an example to illustrate the proposed method’s performance in ITS simulation from real data regarding a health problem; it is not meant to contribute to the substantive evidence on the topic. The dataset employed comprises routine hospital admissions with 600–1100 ACEs. More information about the dataset can be found in the Barone-Adesi et al. article [25].
Methods
Preprocessing
A time series is a sequence of measurements recorded at successive points in time. Thus, the ACE data collected monthly from 2002 to 2006 is a TS. Each TS consists of four terms: jump + trend + period + stochastic component. The first three terms, known as deterministic terms, are calculable and removable. The jump term represents the sudden changes that occur in the TS. These changes are detectable as steps in TS plots or by numerical tests. The trend term represents the gradual upward or downward changes that take place over a long period of time; this term is represented in the TS by a fitted straight line. The third deterministic term, the period, represents the periodic alternations in the TS, which are seen as sinusoidal variations. Therefore, only the remaining stochastic term is required for use in stochastic or nonlinear modeling. This term is obtained once stationarity (absence of deterministic terms) is achieved. Numerous tests and methods exist for investigating and removing deterministic terms, and some are presented below.
In stochastic modeling, two conditions must be met: first, the TS must be stationary (for details see below, Stationarizing methods); second, the distribution of the TS should be normal. Thus, before stochastic-based modeling starts, the existence of deterministic terms must be checked, and when present, they should be removed. The Mann–Whitney (MW), Fisher, and Mann–Kendall (MK) tests are employed to check the jump, period and trend, respectively; the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test to assess the overall stationarity of the TS; and the Jarque–Bera (JB) test to check the normality of the TS.
Trend
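A nonparametric test is used to assess the trend term in the studied TS. The MK test was developed to detect gradual changes in TS, both seasonal and nonseasonal. The test equation is as follows [26]:
\[ U_{MK} = \begin{cases} \dfrac{MK - 1}{\sqrt{var(MK)}}, & MK > 0 \\ 0, & MK = 0 \\ \dfrac{MK + 1}{\sqrt{var(MK)}}, & MK < 0 \end{cases} \]
where U_{MK} is the standard Mann–Kendall statistic, MK is the Mann–Kendall statistic, and var(MK) is the variance of MK. MK and var(MK) are defined as:
\[ MK = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{sgn}\left( x(j) - x(i) \right), \qquad var(MK) = \frac{N(N-1)(2N+5) - \sum_{g=1}^{p} t_{g}(t_{g}-1)(2t_{g}+5)}{18} \]
where p is the number of groups of identical (tied) values, t_{g} is the number of observations in the g^{th} group, sgn is the sign function, and N is the number of samples.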
The (MK) test equation for a seasonal trend is expressed as follows:
where ω is the number of seasons in a year and σ_{ij} is the covariance of the statistic test in seasons i and j.
The trend in the TS is insignificant if \(U_{\alpha/2} < U_{MK} < U_{1-\alpha/2}\) and \(U_{\alpha/2} < U_{SMK} < U_{1-\alpha/2}\), where U_{α/2} and U_{1 − α/2} are the α/2 and 1 − α/2 quantiles of the standard normal cumulative probability distribution. A probability corresponding to the test statistic greater than 5% means the absence of a significant trend in the TS.
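As an illustration of the nonseasonal trend test, the sketch below computes the Mann–Kendall statistic, its tie-corrected variance, the standardized statistic and a two-sided p-value from the standard normal distribution. The function name `mann_kendall` and the demonstration series are hypothetical, and the standard textbook form of the test is assumed rather than taken from the authors' implementation.

```python
import math
import numpy as np

def mann_kendall(x):
    """Return (MK, U_MK, two-sided p-value) for a nonseasonal series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # MK statistic: sum of the signs of all forward pairwise differences
    mk = 0.0
    for i in range(n - 1):
        mk += np.sign(x[i + 1:] - x[i]).sum()
    # Variance with the correction for groups of tied values
    _, counts = np.unique(x, return_counts=True)
    ties = sum(c * (c - 1) * (2 * c + 5) for c in counts if c > 1)
    var_mk = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    # Standardized statistic with continuity correction
    if mk > 0:
        u = (mk - 1) / math.sqrt(var_mk)
    elif mk < 0:
        u = (mk + 1) / math.sqrt(var_mk)
    else:
        u = 0.0
    p = math.erfc(abs(u) / math.sqrt(2))  # equals 2 * (1 - Phi(|u|))
    return mk, u, p

# Strictly increasing demonstration series (not the ACE data)
mk_demo, u_demo, p_demo = mann_kendall(np.arange(1, 11))
```

A p-value below the chosen significance level (5% in this study) would indicate a significant trend, in which case detrending or differencing is applied before stochastic modeling.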
Jump
A numerical survey for the jump term in the ACE TS, namely the nonparametric MW test, is employed as follows [27]:
\[ U_{MW} = \frac{\sum_{t=1}^{N_{1}} R(g(t)) - \frac{N_{1}(N_{1}+N_{2}+1)}{2}}{\sqrt{\frac{N_{1} N_{2} (N_{1}+N_{2}+1)}{12}}} \]
where g(t) is the ascending ordered ACE series, R(g(t)) is the order (rank) of g(t), and N_{1} and N_{2} are the numbers of observations in the two subseries \(x_{1}(t) = \{x(1), x(2), \ldots, x(N_{1})\}\) and \(x_{2}(t) = \{x(N_{1}+1), x(N_{1}+2), \ldots, x(N)\}\) of the main series, such that the two subseries together make up the main series. If \(P_{|U_{MW}|}\) is larger than the significance level (in this study α = 0.01), then the jump term is insignificant.
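The jump test can be sketched as a rank-sum comparison of the two subseries with a normal approximation. This is a simplified illustration: the function name `mann_whitney_jump` is hypothetical, the split index `n1` is assumed to be supplied by the analyst, and tied values are ranked in order of appearance rather than averaged.

```python
import math
import numpy as np

def mann_whitney_jump(x, n1):
    """Standardized rank-sum statistic and two-sided p-value for a split at n1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    n2 = n - n1
    # Ranks 1..N of each observation in the ascending ordered series
    order = np.argsort(x, kind="stable")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    rank_sum = ranks[:n1].sum()  # sum of ranks of the first subseries
    u = (rank_sum - n1 * (n + 1) / 2.0) / math.sqrt(n1 * n2 * (n + 1) / 12.0)
    p = math.erfc(abs(u) / math.sqrt(2))  # two-sided normal p-value
    return u, p

# Demonstration series with an obvious upward jump after the fifth value
u_jump, p_jump = mann_whitney_jump([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], n1=5)
```

A p-value above the significance level would mean the jump term is insignificant; in the demonstration series the jump is detected.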
Period
The significance of periodicity is investigated with the following statistic [28]:
where F^{*} is the Fisher statistic, N is the number of samples, α_{z} and β_{z} are Fourier coefficients, and Ω_{z} is the angular frequency. α_{z}, β_{z} and Ω_{z} are defined as follows:
where f_{z} is the zth harmonic of the base frequency.
The periodicity related to Ω_{z} is significant if F^{*} exceeds the critical value of the F distribution with (2, N − 2) degrees of freedom at the chosen significance level (here α = 0.05).
Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test
The test is named after its authors [29] and is used to assess the overall stationarity of the ACE TS:
where n is the number of observations in the TS, e_{t} are the residuals, and S_{t}^{2} is the average square of errors between time 1 and t. The statistic used for the "level" and "trend" stationarity tests is given by:
Kwiatkowski et al. [29] calculated the symmetric critical values via Monte Carlo simulation. The probability corresponding to a test statistic higher than 5% indicates stationarity.
Jarque–Bera (JB) test
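The JB test [30] is applied to measure the goodness of fit to a normal distribution, and the test statistic is expressed as follows:
\[ JB = \frac{N}{6}\left( S_{k}^{2} + \frac{(K_{u} - 3)^{2}}{4} \right) \]
where N is the number of samples, K_{u} is kurtosis and S_{k} is skewness. Under the null hypothesis of normality, JB follows a chi-square distribution with two degrees of freedom, which is used to decide whether the data can be assumed normal.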
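The normality check can be sketched directly from sample skewness and kurtosis. The function name `jarque_bera` and the demonstration values are hypothetical; the p-value uses the fact that the chi-square survival function with two degrees of freedom is exp(−JB/2).

```python
import math
import numpy as np

def jarque_bera(x):
    """Return (JB statistic, p-value) using moment-based skewness and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5   # S_k
    kurt = np.mean(d ** 4) / m2 ** 2     # K_u (equals 3 for a normal distribution)
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p = math.exp(-jb / 2.0)              # chi-square survival function, 2 d.o.f.
    return jb, p

# Small symmetric demonstration sample (not the ACE data)
jb_demo, p_jb_demo = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

A p-value above 0.05 would be read as no evidence against normality, which is the condition required before stochastic modeling.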
Stationarizing methods
Trend analysis
If a significant trend term is detected in the TS by the MK test, the seasonal Mann–Kendall (SMK) test or the autocorrelation function (ACF) plot, trend analysis is used to remove or reduce its impact on the TS. A straight line is fitted to the TS and subtracted from the TS values; the remainder is the detrended TS.
Differencing
One of the most widely employed methods of stationarizing TS is differencing. This method eliminates correlations in the TS. The nonseasonal differencing method, which is the subtraction of the previous value from each value, removes the trend in variances and jumps. The equation is as follows:
where MED(t) represents a studied TS, in this case ACE, recorded at time t.
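The two stationarizing operations described above can be sketched as follows. The function names and the synthetic series are hypothetical; the slope and intercept used for the demonstration are illustrative numbers only.

```python
import numpy as np

def detrend_linear(x):
    """Fit a straight line over t = 1..N and subtract it from the series."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1, dtype=float)
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept), slope, intercept

def difference(x):
    """First-order nonseasonal differencing: x(t) - x(t-1)."""
    return np.diff(np.asarray(x, dtype=float))

# Synthetic series with a pure linear trend (hypothetical numbers)
t_lin = np.arange(1, 60, dtype=float)
x_lin = 0.5 * t_lin + 200.0
detrended_demo, slope_demo, intercept_demo = detrend_linear(x_lin)
differenced_demo = difference(x_lin)
```

Note that differencing shortens the series by one observation, whereas detrending keeps its length.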
Stochastic modeling
The autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models are the two most conventional methods of the stochastic approach. The difference between these models is in the data differencing method of the ARIMA model, which makes it suitable for nonstationary TS. The equation for ARIMA(p, d, q) is as follows [31]:
where φ is the autoregressive (AR) parameter, θ the moving average (MA) parameter, ε(t) the residual, d the order of nonseasonal differencing, p and q the AR and MA orders of the model, respectively, I the differencing operator, and (1 − I)^{d} the dth nonseasonal differencing. The values of p and q are determined through autocorrelation function (ACF) and partial autocorrelation function (PACF) diagrams [31]. In the ARMA model, d is equal to 0 and there is no differencing operator.
As it is crucial to investigate the structure of the TS being studied prior to modeling, certain tests and preprocessing methods were initially applied to prepare the data for stochastic modeling. After separation of the dataset into training and testing samples, the existence of deterministic terms in the TS is examined. For this purpose, the MW, MK and Fisher tests are employed to check the existence of jump, trend and period, respectively.
If the results of these tests show no deterministic terms, the stationarity of the TS is checked. Otherwise, any deterministic terms should be eliminated. The KPSS test is applied to check stationarity. If the result of this test does not confirm that the TS is stationary, trend analysis and differencing are applied and the KPSS test is applied again. After ensuring that the TS is stationary, its normality is evaluated using the JB test. Once the TS is confirmed to be stationary and normal, the preprocessing is finished and stochastic modeling is initiated. Initially, depending on the type of problem, it is determined whether the problem is seasonal or not. Then, the range of seasonal and nonseasonal parameters related to autoregressive (AR) and moving average (MA) terms, as well as a constant term, are determined using ACF and PACF diagrams. The ACF and PACF diagrams only determine the most important lags, not the optimum ones.
The optimal model does not necessarily use all the parameters indicated by these two diagrams. The first way to obtain the optimum combination is to examine all the combinations resulting from the defined domains for the stochastic model parameters (i.e. 2^{p(max)+q(max)} − 1 models for an ARMA model). This is very time-consuming, as every candidate model must be fitted, examined and compared with the others. Therefore, the stochastic model is integrated with the continuous genetic algorithm (CGA) in the current study, so that the optimal values of the seasonal MA and AR parameters are determined through an evolutionary process. Then, the residual independence of the proposed model is evaluated using the Ljung–Box test. Finally, the performance of the model is appraised using test data. Considering the maximum number of ARMA, seasonal autoregressive (SAR) and seasonal moving average (SMA) parameters as 5, an example of the optimum solution achieved by ARIMA-CGA is provided in Fig. 2.
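The size of the exhaustive search can be illustrated by enumerating every non-empty subset of candidate AR and MA lags. This sketch only counts and lists candidate structures; it does not fit any model and is not the CGA used in the study. The function name `candidate_arma_structures` is hypothetical.

```python
from itertools import product

def candidate_arma_structures(p_max, q_max):
    """Yield (ar_lags, ma_lags) for every non-empty subset of the candidate lags."""
    for mask in product([0, 1], repeat=p_max + q_max):
        if not any(mask):
            continue  # skip the empty model
        ar_lags = tuple(i + 1 for i in range(p_max) if mask[i])
        ma_lags = tuple(j + 1 for j in range(q_max) if mask[p_max + j])
        yield ar_lags, ma_lags

# With three candidate AR lags and three candidate MA lags
n_models = sum(1 for _ in candidate_arma_structures(3, 3))  # 2**6 - 1 = 63
```

With ten candidate lags for each of p and q the same count is 2^{20} − 1 = 1,048,575 models, which shows why an evolutionary search is preferred over enumeration.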
The objective function of the CGA is defined, in which all possible combinations are considered and the corrected Akaike information criterion (AICC) (Eq. 23) is employed to find the optimum model in terms of accuracy and simplicity simultaneously. The first term of the AICC indicates the accuracy of the model while the second one considers the complexity of the model.
where N is the number of samples, MSE is the mean square error and Comp. is the complexity of the model. Comp. is the sum of the stochastic model orders (p, q, P, Q), plus the constant term if it exists. The MSE is calculated as:
\[ MSE = \frac{1}{N} \sum_{i=1}^{N} \left( MED_{obs,i} - MED_{p,i} \right)^{2} \]
where MED_{obs,i} and MED_{p,i} are the ith observed and predicted values, respectively. The flowchart of the preprocessing-based stochastic model is presented in Fig. 3.
Generalized structure of group method of data handling (GMDH)
GMDH is a self-organized approach that gradually produces more complex models when evaluating the performance of the input and output datasets [32]. In this approach, the relationship between the input and output variables is expressed by the Volterra series, which is similar to the Kolmogorov–Gabor polynomial:
where y is the output variable, A = (a_{0}, a_{1}, …, a_{m}) is the weights vector and X = (x_{1}, …, x_{N}) is the input variables vector. The GMDH model has been developed based on heuristic self-organization to overcome the complexities of multidimensional problems. This method first considers different neurons with two input variables and then specifies a threshold value to determine the variables that cannot reach the performance level. This procedure is a self-organizing algorithm.
The main purpose of the GMDH network is to construct a function in a feedforward network on the basis of a second-degree transfer function. The number of layers and neurons within the hidden layers, the effective input variables and the optimal model structure are automatically determined with this algorithm.
To model with the GMDH algorithm, the entire dataset is first divided into training and testing subsets. After this split, the algorithm creates neurons with two inputs. Given that each neuron has only two inputs, the number of all possible combinations for a model with n input vectors is:
\[ NAPC = \binom{n}{2} = \frac{n(n-1)}{2} \]
where NAPC is the number of all possible combinations and n is the number of input vectors.
According to the quadratic regression polynomial function, all neurons have two inputs and one output with the same structure, and each neuron, with five weights (a_{1}, a_{2}, a_{3}, a_{4}, a_{5}) and one bias (a_{0}), maps the inputs (x_{i}, x_{j}) to the output as follows:
\[ y = a_{0} + a_{1} x_{i} + a_{2} x_{j} + a_{3} x_{i} x_{j} + a_{4} x_{i}^{2} + a_{5} x_{j}^{2} \]
The performance of all neural network methods is heavily influenced by the chosen parameters. The unknown coefficients (a_{0}, a_{1}, a_{2}, a_{3}, a_{4}, a_{5}) are obtained through an ordinary least squares solution as follows:
\[ A = \left( x^{T} x \right)^{-1} x^{T} Y \]
where A = {a_{0}, a_{1}, a_{2}, a_{3}, a_{4}, a_{5}} is the unknown coefficients vector, Y = {y_{1}, …, y_{N}}^{T} is the output vector and x is the input variable vector.
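A single layer of classical two-input GMDH neurons can be sketched as below: every pair of input columns is fitted with a quadratic polynomial by least squares and the candidates are ranked by training error. This is a minimal illustration, not the authors' MATLAB implementation. The function names, the ordering of the polynomial terms (bias, x_i, x_j, x_i·x_j, x_i², x_j²) and the ranking by plain MSE instead of AICC are assumptions.

```python
from itertools import combinations
import numpy as np

def fit_neuron(xi, xj, y):
    """Least-squares fit of one quadratic two-input neuron; returns (coef, prediction)."""
    X = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

def first_layer(inputs, y):
    """Fit one neuron per pair of input columns and rank the neurons by training MSE."""
    results = []
    for i, j in combinations(range(inputs.shape[1]), 2):  # n(n-1)/2 pairs
        coef, pred = fit_neuron(inputs[:, i], inputs[:, j], y)
        results.append((float(np.mean((y - pred) ** 2)), (i, j), coef))
    return sorted(results, key=lambda r: r[0])

# Synthetic example: the target depends only on input columns 0 and 2
rng = np.random.default_rng(1)
inputs_demo = rng.normal(size=(50, 4))
y_neuron = (1.0 + 2.0 * inputs_demo[:, 0] - inputs_demo[:, 2]
            + 0.5 * inputs_demo[:, 0] * inputs_demo[:, 2])
ranked = first_layer(inputs_demo, y_neuron)
```

In a full GMDH network the outputs of the best neurons would become the inputs of the next layer, and the selection would penalize complexity as well as error.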
The AICC criterion (Eq. 23) is applied to determine the optimal network structure and select the neurons describing the target parameter. The Comp. in this equation for the GMDH model is defined as follows:
where Comp. is the complexity, NL is the number of layers and NTN is the number of total neurons.
The performance of classical GMDH in the modeling of nonlinear problems has been demonstrated in various studies [33,34,35,36]. However, along with its advantages, it possesses the following limitations: (i) second-order polynomials, (ii) only two inputs for each neuron, (iii) inputs of each neuron can only be selected from the adjacent layer [37, 38]. In complex nonlinear problems, the necessity of using second-order polynomials may impede an acceptable result. In addition, considering only two inputs per neuron and using adjacent layer neurons would result in a significant increase in the number of neurons (NN) [39].
In the current study, a new scheme of GMDH, the GSGMDH, is employed and encoded in the MATLAB environment. The developed model removes all the mentioned disadvantages, so that each neuron can connect to two or three neurons at a time, taken from adjacent or nonadjacent layers. In addition, the order of polynomials can also be two or three. Similar to classical GMDH, the best structure is chosen based on the AICC index. According to the provided description, the developed GSGMDH can offer four modes: (1) second-order polynomial with two inputs, (2) second-order polynomial with three inputs, (3) third-order polynomial with two inputs, and (4) third-order polynomial with three inputs. The first mode is classical GMDH.
Figure 4 shows an example of the developed GSGMDH for a model with five inputs and one output. In this figure, three different neurons (x_{11}, x_{12}, x_{21}) are presented to provide an equation to estimate the target parameter (y). The two neurons x_{11} and x_{12} have three inputs, which are the inputs of the desired problem. The x_{21} neuron, which is the output of the problem, has three inputs similar to the two previous neurons (x_{11} and x_{12}), except that it uses the nonadjacent layer neurons (x_{13}) in addition to the adjacent layer neurons (x_{11} and x_{12}).
The GSGMDH was used in this study to achieve the most precise results in forecasting the studied TS, which we abbreviated as MED data. GSGMDH is superior to the former method, GMDH, due to the random structure of neurons that is encoded in the genotype string that results in using all neurons from previous layers in subsequent layers. In addition, GSGMDH facilitates finding the minimized training and prediction errors separately, preventing model overtraining. The flow chart of the developed GSGMDH model is presented in Fig. 5.
Before modeling with the GSGMDH method, some parameters must be set. The first is the Maximum Number of Inputs (MNI), which determines the maximum number of inputs for individual neurons and can be two or three; if set to three, both two and three inputs are tried. The second is Inputs More (IM), which can be zero or one: if set to zero, the inputs of each neuron are taken only from the previous layer, while if set to one, inputs may also be taken from nonadjacent layers. The Maximum Number of Neurons (MNN) is equal to the number of input variables, although it can be twice that number for complex problems. The polynomial degree (PD) can be two or three; if set to three, both degrees are allowed.
Combining linear and nonlinear models (data-driven method)
ITS consists of stochastic and deterministic components. Thus, by using appropriate data preprocessing methods, it is possible to reduce the problematic effects of deterministic components in the modeling process. The proposed methodology is based on a continuous modeling process. This data-driven method is based on preprocessing to identify linear and nonlinear components of ITS, verification of the validity of the decomposed data, and the decomposed model. In the studied case [6], the ACE TS fluctuates greatly. The outcomes of the single stochastic and neural network modeling approaches are relatively weak. Hence, as a third approach, the ACE TS is modeled with a combined stochastic–neural network model. Stochastic models perform efficiently when TS are linear and do not contain deterministic terms that are responsible for nonlinearity. AI methods, on the other hand, allow the modeling of TS with nonlinear components. The TS, however, is not purely linear or nonlinear; both components are present simultaneously, and their interaction sometimes produces complex structures in the TS. In such cases, the results of single stochastic or nonlinear methods might be improved by a combined model. Combining stochastic models with AI methods is one of the most effective methods of modeling TS with complex structures. As shown in Fig. 6, the residuals of the stochastic models were used as a new TS in GSGMDH modeling, such that the features of both modeling approaches were utilized.
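The residual-based combination can be sketched with deliberately simple stand-ins: a least-squares autoregressive model plays the role of the stochastic (ARMA/ARIMA) model, and a single quadratic two-input neuron on lagged residuals plays the role of the GSGMDH network. Both substitutions, all function names and the synthetic series are assumptions made for illustration, not the models used in the study.

```python
import numpy as np

def lagged_matrix(x, lags):
    """Rows [x(t-1), ..., x(t-lags)] aligned with the targets x(t)."""
    X = np.column_stack([x[lags - k: len(x) - k] for k in range(1, lags + 1)])
    return X, x[lags:]

def fit_linear_ar(x, lags):
    """Stand-in linear model: AR(lags) with intercept, fitted by least squares."""
    X, y = lagged_matrix(x, lags)
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ coef, y

def fit_residual_neuron(res):
    """Stand-in nonlinear model: quadratic polynomial of the two previous residuals."""
    X, y = lagged_matrix(res, 2)
    a, b = X[:, 0], X[:, 1]
    Q = np.column_stack([np.ones(len(y)), a, b, a * b, a ** 2, b ** 2])
    coef, *_ = np.linalg.lstsq(Q, y, rcond=None)
    return Q @ coef

def hybrid_fit(x, ar_lags=3):
    x = np.asarray(x, dtype=float)
    lin_pred, target = fit_linear_ar(x, ar_lags)
    residuals = target - lin_pred          # new TS passed to the nonlinear model
    res_pred = fit_residual_neuron(residuals)
    # The residual model cannot predict its first two points, so align the outputs
    return target[2:], lin_pred[2:], lin_pred[2:] + res_pred

# Synthetic monthly-like series (hypothetical, not the ACE data)
rng = np.random.default_rng(0)
t_h = np.arange(120)
x_h = 200.0 + 10.0 * np.sin(2 * np.pi * t_h / 12) + rng.normal(0.0, 2.0, size=120)
target_h, linear_h, hybrid_h = hybrid_fit(x_h, ar_lags=3)
```

On the training sample the combined prediction cannot have a larger squared error than the linear part alone, because the residual model is fitted by least squares; whether the gain carries over to unseen data has to be checked on a separate test period.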
Verification indices to evaluate models
To verify the accuracy of the TS MED forecasting, the correlation coefficient (R), scatter index (SI), mean absolute percentage error (MAPE), root mean squared relative error (RMSRE) and performance index (ρ) are used. In addition to these indices, the corrected Akaike information criterion (AICC) and the Nash–Sutcliffe model efficiency (E_{NS}) are used; these are based, respectively, on comparing the model's simplicity with its goodness-of-fit and on the amount of deviation from the mean value [40]. The AICC index is used to find the best model in each TS modeling; the lower the index value, the better the balance between simplicity and accuracy. The E_{NS} index ranges from −∞ to 1, and the closer the index is to one, the more accurate the model.
where k is the number of parameters, N is the number of samples, σ_{ε}^{2} is the variance of the residuals, E_{NS} is the Nash–Sutcliffe test statistic, and MED_{obs,i} and MED_{pred,i} are the ith values of the actual and forecasted MED, respectively.
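The verification indices can be computed as in the sketch below. The definitions used here are common forms from the forecasting literature and are assumptions, since the source equations are not shown: SI is taken as RMSE divided by the observed mean, ρ as SI/(1 + R), and MAPE is expressed in percent. The function name `evaluate` and the demonstration values are hypothetical.

```python
import numpy as np

def evaluate(obs, pred):
    """Return a dict of goodness-of-fit indices for observed vs. predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred
    r = float(np.corrcoef(obs, pred)[0, 1])                 # correlation coefficient
    rmse = float(np.sqrt(np.mean(err ** 2)))
    si = rmse / float(obs.mean())                           # scatter index (assumed form)
    mape = 100.0 * float(np.mean(np.abs(err / obs)))        # percent
    rmsre = float(np.sqrt(np.mean((err / obs) ** 2)))
    rho = si / (1.0 + r)                                    # performance index (assumed form)
    e_ns = 1.0 - float(np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2))
    return {"R": r, "SI": si, "MAPE": mape, "RMSRE": rmsre, "rho": rho, "E_NS": e_ns}

# Small demonstration (hypothetical values)
scores_demo = evaluate([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 310.0, 390.0])
```

Relative indices such as MAPE and RMSRE divide by the observed values, so they are only meaningful when the observations are far from zero, as is the case for admission rates.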
The Ljung–Box test is used to check the independence of the residuals of the modeled TS [41]. The test statistic is calculated as follows:
\[ Q = N(N+2) \sum_{h=1}^{m} \frac{r_{h}^{2}}{N - h} \]
where N is the number of samples, r_{h} is the autocorrelation coefficient of the residuals (ε_{t}) at lag h, and the value of m is equal to ln(N). If the probability corresponding to the Ljung–Box test statistic in the χ² distribution is higher than the α-level (in this case P_{Q} > α = 0.05), the residual series is white noise and the model is adequate.
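The Ljung–Box statistic can be sketched as follows. The function name `ljung_box_q` is hypothetical, and rounding ln(N) to the nearest integer to obtain the number of lags m is an assumption. The p-value is not computed here; it would be obtained from the χ² distribution with a statistics library.

```python
import math
import numpy as np

def ljung_box_q(residuals, m=None):
    """Return (Q, m): the Ljung-Box statistic over the first m residual autocorrelations."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    if m is None:
        m = max(1, int(round(math.log(n))))  # assumption: m = ln(N), rounded
    d = e - e.mean()
    denom = np.sum(d ** 2)
    total = 0.0
    for h in range(1, m + 1):
        r_h = np.sum(d[h:] * d[:-h]) / denom  # lag-h autocorrelation of the residuals
        total += r_h ** 2 / (n - h)
    return n * (n + 2) * total, m

# White noise versus a strongly autocorrelated series (synthetic)
rng = np.random.default_rng(42)
noise_demo = rng.normal(size=200)
q_noise, m_noise = ljung_box_q(noise_demo)
q_walk, _ = ljung_box_q(np.cumsum(noise_demo))
```

A small Q (large p-value) is consistent with independent residuals, whereas a series such as the random walk in the demonstration produces a very large Q.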
Results
Preprocessing tests
The values of the JB test show that the desired TS is distributed normally (p_{JB} = 66.29 > 0.05). Figure 7 indicates the ACF and the PACF of the main TS (standardized rate of ACEs TS), and data showed that there is a correlation up to three nonseasonal lags (the time period). Since the values of ACF are rapidly damped and are within the limit boundaries, there is no significant period or trend in the TS. However, to ensure this, the existence of deterministic terms and stationarity of the main TS was also evaluated using quantitative tests.
Table 1 provides the results of the quantitative tests used to evaluate the existence of deterministic terms, stationarity and normality of the main series and of the detrended and differenced TS. The results of the nonseasonal and seasonal MK tests show that the p-values of MK and seasonal MK are 0.02 and 0.24, respectively. Therefore, the ACE TS has a nonseasonal trend (p_{MK} = 0.02 is less than the critical value, 0.05). Hence, the trend must be removed. Moreover, the Fisher test indicates that the TS has a period, as the statistic (p_{Fisher} = 5.85) is greater than the critical value of 3. According to the Fisher test, the period is minor, as the statistic is not far above the critical value. Moreover, the MW test shows there is no jump in the TS (p_{MW} = 2.78, higher than the acceptable value 0.05). The KPSS test also indicates the TS is nonstationary (p_{KPSS} = 0.19, higher than the acceptable value 0.05), which is because of the trend and period detected in the TS.
To remove the deterministic terms, two scenarios are defined: detrending and stationarizing the ACE TS by differencing before stochastic modeling. The linear trend line is obtained as follows: trend line = 0.4792 × t + 201.72.
After eliminating the linear trend from the main TS, all of the deterministic factors are removed: the detrended TS is stationary with no deterministic term. Like the main TS, it has a normal distribution. Consequently, the detrended TS is modeled with ARMA. To find the parameters of the ARMA model, ACF and PACF diagrams are employed. As shown in Fig. 8, there is still a correlation up to three nonseasonal lags. Therefore, p and q in ARMA(p,q) are initially considered as p,q = {0,1,2,3}. To allow more accurate and simpler models to be found, the maximum value of these parameters is set to 10.
In the second scenario, differencing the main TS is proposed (differenced TS) to remove the deterministic terms. The findings in Table 1, in which the MW, MK, SMK and JB values are higher than 0.05 and the Fisher statistic is higher than 3, indicate that differencing increases the period in the new TS. Although the Fisher test exhibits growth in periodicity, the stationarity of the differenced TS increases considerably, thus enabling the modeling of the TS. Furthermore, the differencing method has a considerable impact on the correlation of the lags and decreases them markedly. Hence, an ARIMA model could be employed with fewer parameters and subsequently less error. The ACF and PACF of the differenced TS (Fig. 9) indicate that the values of p and q in this state are lower than in the ARMA model.
Stochastic modeling
TS modeling, however, offers numerous combinations of previous lags from which to select the most appropriate TS input combination. Therefore, applying suitable preprocessing should lead to determining and selecting the most effective lags for modeling. According to the ACF plots for both preprocessed TS and the test results, a maximum of three parameter orders are required for ARMA modeling and one for ARIMA modeling (Figs. 7, 8, 9). For modeling, the first 50 data points were used for the training stage and the remainder (nine data points) for the testing stage. The stochastic-based linear modeling results are presented in Table 2. As the results in this table indicate, both linear models are relatively weak in modeling the ACE TS, with ARMA performing marginally better than ARIMA. The ARMA model with seven nonseasonal autoregressive parameters and five nonseasonal moving average parameters modeled the ACE TS with R = 77.95%, SI = 3.46%, MAPE = 2.89%, RMSRE = 3.54%, E_{NS} = 0.66 and AICC = −15.84 in testing. The ARIMA model performed slightly worse than ARMA, with R = 73.74%, SI = 4.21%, MAPE = 2.89%, RMSRE = 4.37%, E_{NS} = 0.50 and AICC = −5.55 in testing. The Ljung–Box results for the ARIMA and ARMA models are provided in Fig. 10. The test is done for the first 47 lags of the training part and the eight lags of the test part separately (n − 1 data are considered for testing). It is observed that the residuals of both linear models are independent white noise, so the modeling is adequate. Figure 11 shows scatter plots of both ARMA and ARIMA models in testing and training versus the observed data. According to this figure, the majority of forecasted data are located within the 5% intervals.
Generalized structure group method of data handling (GSGMDH)
As mentioned earlier, AI methods are widely utilized for data simulation and forecasting. Each TS consists of two parts: linear and nonlinear. Stochastic models, also known as linear models, are able to model the linear part of the TS; hence, the nonlinearity is removed from the TS prior to modeling by preprocessing methods. Conversely, AI models are known for their ability to model the nonlinear part. The neural network applied to the ACE TS under study is the GSGMDH model, which is enhanced by the genetic algorithm. In this GSGMDH, crossover and mutation may be applied randomly along the whole length of the chromosome string. Neurons from all layers are used, and the errors are calculated separately for the training and testing sets; both sets have low errors. The results of this method are presented in Table 3. According to the results, the model was able to forecast the original TS without preprocessing with R = 82.35%, SI = 4.22%, MAPE = 3.16%, RMSRE = 4.25% and ρ = 2.33% for the training data and R = 89.35%, SI = 2.60%, MAPE = 2.10%, RMSRE = 2.66% and ρ = 2.33% for the testing data. As the scatter plot for both training and testing data in Fig. 12 indicates, the majority of forecasted data in the testing period have less than 5% error and are located within the intervals. The AI method employed forecast the nonlinearity in the ACE TS very well. Although the GSGMDH method performed better than the single ARIMA and ARMA models, with a mean growth of 11.63% in correlation, its results remain relatively close to those of the stochastic models. Therefore, a complementary method is required.
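For readers unfamiliar with GMDH-type networks, the building block is a neuron that maps two inputs to an output through a quadratic polynomial fitted by least squares [32]. The sketch below shows one such neuron only. It is not the GSGMDH used in this study: the generalized structure and the genetic-algorithm search are omitted, and the function names and synthetic data are assumptions for illustration.

```python
import numpy as np

def quadratic_design(x1, x2):
    """Design matrix for y = a0 + a1*x1 + a2*x2 + a3*x1^2 + a4*x2^2 + a5*x1*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

def fit_neuron(x1, x2, y):
    """Least-squares estimate of the six polynomial coefficients."""
    coeffs, *_ = np.linalg.lstsq(
        quadratic_design(x1, x2), np.asarray(y, dtype=float), rcond=None
    )
    return coeffs

def predict_neuron(coeffs, x1, x2):
    """Evaluate the fitted quadratic neuron."""
    return quadratic_design(x1, x2) @ coeffs

# Synthetic check: recover known coefficients from noise-free data.
rng = np.random.default_rng(0)
x1 = rng.normal(size=40)
x2 = rng.normal(size=40)
true_coeffs = np.array([1.0, 0.5, -0.3, 0.2, 0.1, -0.4])
y = quadratic_design(x1, x2) @ true_coeffs

fitted = fit_neuron(x1, x2, y)
print(np.round(fitted, 3))
```

In a full GMDH network, many such neurons are fitted on pairs of inputs (here, pairs of lagged TS values would be the natural choice) and the best ones are retained layer by layer according to their error on held-out data.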
Combined datadriven modeling
As mentioned in previous sections, each model has certain specifications. Stochastic models perform efficiently when the TS is linear and does not contain the deterministic terms that are responsible for nonlinearity. AI methods, on the other hand, allow the modeling of TS with nonlinear components. A TS, however, is rarely purely linear or purely nonlinear. Both components are present simultaneously, and their combination sometimes creates complex structures. In such cases, employing a single stochastic or nonlinear method does not provide acceptable results, and alternative solutions are required. Hybridizing stochastic models with AI methods is one of the most viable ways of modeling TS with complex structures. In the studied case, the ACE TS fluctuates greatly, and the outcomes of the single stochastic and neural network modeling approaches are relatively weak. Thus, as a third approach, the ACE TS is modeled with a combined stochastic–neural network model. The hybrid model results are provided in Table 4.
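One common way to combine a linear and a nonlinear model is additive: the linear model is fitted first, a nonlinear model is fitted to its residuals, and the two outputs are summed. The sketch below illustrates that scheme under loud assumptions: an autoregressive model fitted by least squares stands in for ARMA/ARIMA, a quadratic polynomial of lagged residuals stands in for GSGMDH, and the data are synthetic. It is not the coupling used in this study, whose details are given in the Methods.

```python
import numpy as np

def lagged_matrix(x, p):
    """Rows [x[t-1], ..., x[t-p]] for t = p..len(x)-1, and targets x[t]."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[p - j : len(x) - j] for j in range(1, p + 1)])
    return X, x[p:]

def least_squares_fit(A, y):
    """Return coefficients and fitted values of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

def quadratic_features(X):
    """Constant, linear, squared and pairwise-product terms of the columns of X."""
    m = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, i] for i in range(m)]
    cols += [X[:, i] * X[:, j] for i in range(m) for j in range(i, m)]
    return np.column_stack(cols)

# Synthetic series with a linear and a nonlinear component (NOT the ACE data).
rng = np.random.default_rng(1)
n, p = 59, 2
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + 0.4 * y[t - 1] ** 2 / (1 + y[t - 1] ** 2) + rng.normal(0, 0.3)

# Stage 1: linear (stochastic stand-in) model on lagged observations.
X, target = lagged_matrix(y, p)
_, linear_fit = least_squares_fit(np.column_stack([np.ones(len(X)), X]), target)
resid = target - linear_fit

# Stage 2: nonlinear stand-in model on lagged residuals.
R, resid_target = lagged_matrix(resid, p)
_, nonlinear_fit = least_squares_fit(quadratic_features(R), resid_target)

# Stage 3: hybrid output = linear part + modeled residual.
hybrid_fit = linear_fit[p:] + nonlinear_fit
rmse_linear = np.sqrt(np.mean((target[p:] - linear_fit[p:]) ** 2))
rmse_hybrid = np.sqrt(np.mean((target[p:] - hybrid_fit) ** 2))
print("in-sample RMSE, linear:", round(rmse_linear, 4), " hybrid:", round(rmse_hybrid, 4))
```

Because the second stage is itself a least-squares fit with an intercept, the in-sample error of the hybrid cannot exceed that of the linear model on the same rows; any gain on a held-out testing period has to be checked separately.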
Table 4 shows that the correlation between the modeled and observed data increases. The R exhibited an average enhancement of 17.74% over the single stochastic model and 2.05% over the single GSGMDH model. Although the results are only slightly better than those of the single GSGMDH model, model accuracy improved and the errors are about half those of the linear model. The ARMA–GSGMDH model with R = 91.03%, SI = 2.52%, MAPE = 2.13%, RMSRE = 2.47% and ρ = 1.29% outperformed the ARIMA–GSGMDH model with R = 91.91%, SI = 2.86%, RMSRE = 2.99% and ρ = 1.56% as well as all other models. Figure 13 shows the scatter plot of the hybrid modeling results, where almost all forecasted data are within the ± 5% error interval. Figures 14 and 15 provide a good comparison between the observed MED data and the models. The box plot (Fig. 4a) shows that the hybrid model forecasted the interquartile area, mean and median of the data better than the other models. However, the maximum and minimum predictions varied between the models (Fig. 4b). The superiority of the ARIMA–GSGMDH model is demonstrated by the model’s maximum, minimum and interquartile areas, which are much closer to the observed data than those of all other models, especially the regression model used in the Bernal et al. [6] study.
The Taylor diagram [42] summarizes the performance of all the tested models simultaneously using the standard deviation (SD) and R. The distance from any point to the observed data in the diagram is equivalent to the centered RMSE, and a precise model is one with a coefficient of determination of 1 and an SD similar to that of the observed data [43, 44]. As illustrated in Fig. 16, the tested ITS models, including the combined data-driven models (ARMA–GSGMDH and ARIMA–GSGMDH), performed better than the models in the Bernal et al. study [6]. Both combined data-driven models were situated closer to the reference (observed) point than the single models (GSGMDH, ARMA and ARIMA). ARMA–GSGMDH has a lower SD and higher R. By applying a combined model, the difference between the modeled and observed data is decreased and the accuracy of the predicted results is increased (Table 4, Fig. 13) in both the training and testing stages. The R of the proposed combined data-driven models (ARMA–GSGMDH and ARIMA–GSGMDH) exhibited an average enhancement of 17.74% over the single stochastic models (ARMA and ARIMA) and 2.05% over the nonlinear model (GSGMDH). Although the results of the combined approaches are only slightly better than those of the single GSGMDH model, the accuracy is improved and the errors are about half those of the linear model.
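The geometric reading of the Taylor diagram rests on the identity E'² = σ_m² + σ_o² − 2 σ_m σ_o R, where E' is the centered RMSE, σ_m and σ_o are the SDs of the modeled and observed data, and R is their correlation coefficient [42]. The short sketch below checks this identity numerically on synthetic data; the function name `taylor_stats` is hypothetical.

```python
import numpy as np

def taylor_stats(obs, mod):
    """Return (SD of observations, SD of model, R, centered RMSE)."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    sd_o = obs.std()                      # population SD (ddof=0)
    sd_m = mod.std()
    r = np.corrcoef(obs, mod)[0, 1]
    crmse = np.sqrt(np.mean(((mod - mod.mean()) - (obs - obs.mean())) ** 2))
    return sd_o, sd_m, r, crmse

# Synthetic observed and modeled series (NOT the ACE data).
rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, size=59)
mod = 0.8 * obs + rng.normal(2.0, 1.0, size=59)

sd_o, sd_m, r, crmse = taylor_stats(obs, mod)
law_of_cosines = np.sqrt(sd_m**2 + sd_o**2 - 2.0 * sd_m * sd_o * r)
print("centered RMSE:", round(crmse, 6), " from SDs and R:", round(law_of_cosines, 6))
```

The two printed values agree, which is why a model point that lies close to the reference point on the diagram has both a high R and an SD close to that of the observations.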
As illustrated in Fig. 16, the single stochastic ARMA and ARIMA models occupy almost the same location in the diagram as the regression model of the Bernal et al. study [6], and show a relatively higher RMSE than the single GSGMDH and the combined models. The plot in Fig. 16 shows that the combined ARMA–GSGMDH model matches the observed ACE data [25] better than the regression model [6]. Moreover, by combining the features of both models (ARMA and GSGMDH), the fluctuations in the ACE TS could be better predicted. The series has severe fluctuations, which is why linear models alone cannot adequately forecast the data (Fig. 16). Accordingly, the data in Table 5 show that the combined model improved on the results of the linear regression. The statistical indices indicate that the linear regression model has lower accuracy (R being 11.83% lower) and higher errors (SI, 1.33%; MAPE, 1.25%; and RMSRE, 1.19%) than the proposed model. Index ρ can be employed for measuring model error in addition to examining the correlation between the modeled and observed values. This index is lower for the ARMA–GSGMDH model (ρ = 1.86%) than for the Bernal et al. [6] model (ρ = 2.66%). Moreover, the Nash–Sutcliffe coefficient (E_{N–S}), an index that reveals a model’s weakness in forecasting extreme values, was E_{N–S} = 0.58 for the regression model, which is considerably lower than the E_{N–S} = 0.78 of the combined model.
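The indices compared above can be computed as in the following minimal sketch. The formulas are assumed standard definitions (SI as RMSE divided by the observed mean, MAPE and RMSRE from relative errors, E_{N–S} as one minus the ratio of squared errors to the variance of the observations); they may differ in detail from the definitions in the Methods. Index ρ is not included because its definition is not restated in this section, and the numbers used are illustrative, not taken from Table 5.

```python
import numpy as np

def error_indices(obs, pred):
    """Assumed standard definitions of R, SI, MAPE, RMSRE and E_NS."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred
    rel = err / obs                       # relative error; requires obs != 0
    rmse = np.sqrt(np.mean(err**2))
    return {
        "R": np.corrcoef(obs, pred)[0, 1],            # as a fraction, not percent
        "SI": 100.0 * rmse / obs.mean(),              # scatter index, percent
        "MAPE": 100.0 * np.mean(np.abs(rel)),         # percent
        "RMSRE": 100.0 * np.sqrt(np.mean(rel**2)),    # percent
        "E_NS": 1.0 - np.sum(err**2) / np.sum((obs - obs.mean()) ** 2),
    }

# Illustrative values only.
obs = np.array([100.0, 200.0, 400.0])
pred = np.array([110.0, 190.0, 400.0])
indices = error_indices(obs, pred)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

With a chronological split such as the one used here (first 50 points for training, last nine for testing), the same function would be applied separately to each period.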
Discussion
This study provides a novel approach to ITS modeling based on a continuous translational data-driven approach. To validate the developed model, we assessed the effects of the Italian smoking ban in public areas on hospital admissions for acute coronary events. We propose a hybrid methodology using a continuous translational data-driven approach based on a combination of stochastic and AI methods that will (i) increase the accuracy of prediction results through a continuous modeling process, and (ii) importantly, solve a challenging issue in ITS modeling, namely the time lag between the pre- and postintervention periods, which limits the application of the segmented regression method in ITS modeling.
The complex dynamic behavior of the ACE can be modeled with a TS approach, which deduces the characteristics of the data generation process by analyzing historical data. In a recent study, Bonakdari et al. [24] showed that the future prevalence of a complex health care outcome at a specific time can be evaluated from its historical TS. As different dependent parameters can have a serious impact on the outcome, relevant information regarding the ACE was extracted from historical data and summarized as internal patterns. In this study, the ACE TS was modeled using linear-based stochastic models (ARMA, ARIMA), the nonlinear-based GSGMDH, and an integration of a continuous linear (stochastic) model with a nonlinear model (data-driven method). The two fundamental premises for stochastic modeling are stationarity and normal distribution. To achieve stationarity, the deterministic terms should be removed from the TS. For this purpose, the structure of the ACE TS was first investigated with stationarity and normality tests, which showed that the TS was normally distributed but not stationary (Table 1). Tests for the deterministic terms responsible for nonstationarity (trend, jump and period) were then performed, and trend and jump were found in the series. The ACE TS was stationarized by trend analysis (detrending) and then by differencing the detrended data. The former, surprisingly, eliminated all deterministic terms and stationarized the TS very well, by 44.55% (Table 1). The latter improved the stationarity and removed the linear trend completely, but introduced some fluctuations in the TS and increased the Fisher statistic. Nonetheless, the preprocessed ACE TS was completely stationary and normal. The ARMA and ARIMA models were applied to the series first. To determine the order of the models, ACF plots were used, and a maximum of three parameters was required.
For further investigation, ten parameters were considered in modeling. The ARMA model with seven nonseasonal autoregressive parameters and five nonseasonal moving average parameters outperformed the ARIMA model in the testing period. For the second TS modeling approach, the ACE TS data were modeled by GSGMDH. The most important feature of such models is their ability to model nonlinearity better than linear stochastic models. The results showed that the single nonlinear GSGMDH model improved accuracy over the linear models. In the third and final step, linear and nonlinear models were combined. As the results showed, both ARMA–GSGMDH and ARIMA–GSGMDH outperformed the single models. The combined models enhanced the results by an average of 17.74% and 2.05% compared to the single linear and nonlinear models, respectively. As illustrated in the Taylor diagram, the combined models have a higher R with the observed data, a lower RMSE and an SD closer to that of the observed data than the other models, and thus fit the observed data better.
In addition to ITS modeling, the proposed methodology can be employed for TS prediction. To verify the performance of the methodology in modeling a TS data set, another real health care case was assessed. Bhaskaran et al. [45] used TS modeling in environmental epidemiology. They studied the association between ozone levels and the total number of deaths in the city of London (UK) over a period of five years, from 1 January 2002 to 31 December 2006. In brief, the authors [45] investigated three alternative techniques (a time-stratified model, periodic functions, and flexible spline functions) to shed light on key considerations for modeling the long-term patterns of the studied TS. Their predictions of the total number of deaths as TS outcomes yielded R = 0.71, 0.65 and 0.69 for the three methods, respectively. When the present methodology was applied to their dataset, the hybrid model (ARMA–GSGMDH) gave more accurate results, with R = 0.75 for the total number of deaths. These results confirm not only that the proposed hybrid model is able to predict ITS outcomes (with no need to identify the effect of the implemented intervention on outcomes), but also that it can be employed for modeling TS with high accuracy compared to conventional approaches.
As detailed by Bonakdari et al. [46], conventional analysis of ITS in healthcare is based on regression methods that depend heavily on the intervention lag time, which is very often difficult to determine. The present methodology, however, can be employed continuously in such cases. For example, the hybrid model could be applied to several health conditions: to analyze the relationship between smoking bans and the incidence of acute myocardial infarction [47]; to analyze the effect of a quality improvement strategy on the rate of being up-to-date with pneumococcal vaccination [48]; and to assess the impact of health information technology initiatives on the performance of rheumatoid arthritis disease activity measures and outcomes [16], to name a few.
As with all studies, the hybrid methodology has limitations, which are mostly associated with the stochastic and/or nonlinear models. The most important limitation of such a hybrid method is the minimum length of the outcome TS dataset needed in the training stage. In addition, selecting appropriate parameters for the stochastic models in some cases requires additional stationarization steps, which could involve differencing, seasonal standardization, and spectral analysis methods. In turn, selecting the best input combination in nonlinear models can also be a challenging task. Finally, designing the AI architecture for a given ITS requires several trial-and-error steps to find the appropriate parameters.
Conclusions
Our study suggests that the proposed continuous translational data-driven model not only predicts ACEs with high accuracy and improves ITS prediction compared to current regression methods but, importantly, does not require any predefined lag time between the pre- and postintervention periods. This methodology can therefore be used as a reliable alternative in public health intervention evaluation. Hence, the novel hybrid approach provides a step forward by facilitating the modeling of such assessments in a short time. This is important for decision makers, who must manage health conditions as complex adaptive systems in a timely manner.
Availability of data and materials
Not relevant.
Abbreviations
ACEs: Acute coronary events
ACF: Autocorrelation function
AI: Artificial intelligence
AICC: Akaike information criterion
AR: Autoregressive
ARMA: Autoregressive moving average
ARIMA: Autoregressive integrated moving average
CGA: Continuous genetic algorithm
GMDH: Group method of data handling
GSGMDH: Generalized structure group method of data handling
IM: Inputs more
ITS: Interrupted time series
MA: Moving average
TS: Time series
KPSS: Kwiatkowski–Phillips–Schmidt–Shin
JB: Jarque–Bera
MK: Mann–Kendall
MAPE: Mean absolute percentage error
MNI: Maximum number of inputs
MNN: Maximum number of neurons
MW: Mann–Whitney
NN: Number of neurons
PD: Polynomial degree
RMSRE: Root mean squared relative error
SAR: Seasonal auto regressive
SD: Standard deviation
SMA: Seasonal moving average
SMK: Seasonal Mann–Kendall
References
Bonakdari H, Moeeni H, Ebtehaj I, Zeynoddin M, Mahoammadian A, Gharabaghi B. New insights into soil temperature time series modeling: linear or nonlinear? Theor Appl Climat. 2019;135(3–4):1157–77.
Moeeni H, Bonakdari H, Ebtehaj I. Integrated SARIMA with neuro-fuzzy systems and neural networks for monthly inflow prediction. Water Resour Manag. 2017;31:2141–56.
Ebtehaj I, Bonakdari H, Zeynoddin M, Gharabaghi B, Azari A. Evaluation of preprocessing techniques for improving the accuracy of stochastic rainfall forecast models. Int J Environ Sci Technol. 2020;17:505–24.
Gustafsson NK, Ramstedt MR. Changes in alcohol-related harm in Sweden after increasing alcohol import quotas and a Danish tax decrease: an interrupted time-series analysis for 2000–2007. Int J Epidemiol. 2011;40(2):432–40.
Corcoran P, Griffin E, Arensman E, Fitzgerald AP, Perry IJ. Impact of the economic recession and subsequent austerity on suicide and self-harm in Ireland: an interrupted time series analysis. Int J Epidemiol. 2015;44(3):969–77.
Bernal JL, Cummins S, Gasparrini A. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol. 2017;46(1):348–55.
Bernal JL, Cummins S, Gasparrini A. The use of controls in interrupted time series studies of public health interventions. Int J Epidemiol. 2018;47(6):2082–93.
Laverty AA, Kypridemos C, Seferidi P, Vamos EP, Pearson-Stuttard J, Collins B, et al. Quantifying the impact of the Public Health Responsibility Deal on salt intake, cardiovascular disease and gastric cancer burdens: interrupted time series and microsimulation study. J Epidemiol Commun Health. 2019;73(9):881–7.
Garriga C, Murphy J, Leal J, Price A, Prieto-Alhambra D, Carr A, et al. Impact of a national enhanced recovery after surgery programme on patient outcomes of primary total knee replacement: an interrupted time series analysis from “The National Joint Registry of England, Wales, Northern Ireland and the Isle of Man.” Osteoarthr Cartil. 2019;27(9):1280–93.
Zhu D, Shi X, Nicholas S, Bai Q, He P. Impact of China’s healthcare price reforms on traditional Chinese medicine public hospitals in Beijing: an interrupted time-series study. BMJ Open. 2019;9(8):e029646.
Wagner AK, Soumerai SB, Zhang F, Ross-Degnan D. Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther. 2002;27(4):299–309.
Penfold R, Zhang F. Use of interrupted time series analysis in evaluating health care quality improvements. Acad Pediatr. 2013;13(6 Suppl):S38–44.
Judge A, Wallace G, Prieto-Alhambra D, Arden NK, Edwards CJ. Can the publication of guidelines change the management of early rheumatoid arthritis? An interrupted time series analysis from the United Kingdom. Rheumatology. 2015;54(12):2244–8.
Linden A. Conducting interrupted time-series analysis for single- and multiple-group comparisons. Stata J. 2015;15(2):480–500.
Cordtz RL, Hawley S, Prieto-Alhambra D, Hojgaard P, Zobbe K, Overgaard S, et al. Incidence of hip and knee replacement in patients with rheumatoid arthritis following the introduction of biological DMARDs: an interrupted time-series analysis using nationwide Danish healthcare registers. Ann Rheum Dis. 2018;77(5):684–9.
Gandrup J, Li J, Izadi Z, Gianfrancesco M, Ellingsen T, Yazdany J, et al. Three quality improvement initiatives improved performance of rheumatoid arthritis disease activity measures in electronic health records: results from an interrupted time series study. Arthritis Care Res (Hoboken). 2019;72(2):283–91.
Hawley S, Ali MS, Berencsi K, Judge A, Prieto-Alhambra D. Sample size and power considerations for ordinary least squares interrupted time series analysis: a simulation study. Clin Epidemiol. 2019;11:197–205.
Arsalan M, Qureshi AS, Khan A, Rajarajan M. Protection of medical images and patient related information in healthcare: using an intelligent and reversible watermarking technique. Appl Soft Comput. 2017;51:168–79.
Ignatov A. Real-time human activity recognition from accelerometer data using convolutional neural networks. Appl Soft Comput. 2018;62:915–22.
Krishnan GS, Kamath SS. A novel GA-ELM model for patient-specific mortality prediction over large-scale lab event data. Appl Soft Comput. 2019;80:522–33.
Selvaraj A, Patan R, Gandomi AH, Deverajan GG, Pushparaj M. Optimal virtual machine selection for anomaly detection using a swarm intelligence approach. Appl Soft Comput. 2019;84:105686.
Liu M, Zhou M, Zhang T, Xiong N. Semi-supervised learning quantization algorithm with deep features for motor imagery EEG Recognition in smart healthcare application. Appl Soft Comput. 2020;89:106071.
Aladeemy M, Adwan L, Booth A, Khasawneh MT, Poranki S. New feature selection methods based on opposition-based learning and self-adaptive cohort intelligence for predicting patient no-shows. Appl Soft Comput. 2020;86:105866.
Bonakdari H, Pelletier JP, Martel-Pelletier J. A reliable time-series method for predicting arthritic disease outcomes: new step from regression toward a nonlinear artificial intelligence method. Comput Methods Programs Biomed. 2020a;189:105315.
Barone-Adesi F, Gasparrini A, Vizzini L, Merletti F, Richiardi L. Effects of Italian smoking regulation on rates of hospital admission for acute coronary events: a country-wide study. PLoS ONE. 2011;6(3):e17419.
Hirsch RM, Slack JR. A nonparametric trend test for seasonal data with serial dependence. Water Resour Res. 1984;20(6):727–32.
Zeynoddin M, Bonakdari H, Ebtehaj I, Esmaeilbeiki F, Gharabaghi A, Haghi DZ. A reliable linear stochastic daily soil temperature forecast model. Soil Tillage Res. 2019;189:73–87.
Kashyap RL, Rao AR. Dynamic stochastic models from empirical data. New York: Academic Press; 1976.
Kwiatkowski D, Phillips CB, Schmidt P, Yongcheol S. Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? J Econom. 1992;54(1–3):159–78.
Jarque CM, Bera AK. Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Econ Lett. 1980;6(3):255–9.
Cryer JD, Chan KS. Time series analysis. New York: Springer; 2008.
Ivakhnenko AG. Polynomial theory of complex systems. IEEE Trans Syst Man Cybern Syst. 1971;1(4):364–78.
Sharma N, Om H. GMDH polynomial and RBF neural network for oral cancer classification. Netw Model Anal Health Inform Bioinform. 2015;4(1):1–10.
Mo L, Xie L, Jiang X, Teng G, Xu L, Xiao J. GMDH-based hybrid model for container throughput forecasting: selective combination forecasting in nonlinear subseries. Appl Soft Comput. 2018;62:478–90.
Novikov M. Multiparametric quantitative and texture 18F-FDG PET/CT analysis for primary malignant tumour grade differentiation. Eur Radiol Exp. 2019;3:48.
Oprea M. A general framework and guidelines for benchmarking computational intelligence algorithms applied to forecasting problems derived from an application domain-oriented survey. Appl Soft Comput. 2020;89:106103.
Walton R, Binns A, Bonakdari H, Ebtehaj I, Gharabaghi B. Estimating 2-year flood flows using the generalized structure of the Group Method of Data Handling. J Hydrol. 2019;575:671–89.
Safari MJS, Ebtehaj I, Bonakdari H, Eshaghi MS. Sediment transport modeling in rigid boundary open channels using generalize structure of group method of data handling. J Hydrol. 2019;577:123951.
Bonakdari H, Ebtehaj I, Gharabaghi B, Vafaeifard M, Akhbari A. Calculating the energy consumption of electrocoagulation using a generalized structure group method of data handling integrated with a genetic algorithm and singular value decomposition. Clean Technol Environ Policy. 2019;21(2):379–93.
Burnham KP, Anderson DR. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer; 2002.
Ljung GM, Box GEP. On a measure of lack of fit in time series models. Biometrika. 1978;65(2):297–303.
Taylor KE. Summarizing multiple aspects of model performance in a single diagram. J Geophys Res Atmos. 2001;106:7183–92.
Heo KY, Ha KK, Yun KS, Lee SS, Kim HJ, Wang B. Methods for uncertainty assessment of climate models and model predictions over East Asia. Int J Climatol. 2014;34(2):377–90.
Zeynoddin M, Bonakdari H, Azari A, Ebtehaj I, Gharabaghi B, Riahi MH. Novel hybrid linear stochastic with nonlinear extreme learning machine methods for forecasting monthly rainfall a tropical climate. J Environ Manage. 2018;222:190–206.
Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. Time series regression studies in environmental epidemiology. Int J Epidemiol. 2013;42(4):1187–95.
Bonakdari H, Pelletier JP, Martel-Pelletier J. Viewpoint on time series and interrupted time series optimum modeling for predicting arthritic disease outcomes. Curr Rheumatol Rep. 2020b;22(7):27.
Gasparrini A, Gorini G, Barchielli A. On the relationship between smoking bans and incidence of acute myocardial infarction. Eur J Epidemiol. 2009;24(10):597–602.
Desai SP, Lu B, Szent-Gyorgyi LE, Bogdanova AA, Turchin A, Weinblatt M, et al. Increasing pneumococcal vaccination for immunosuppressed patients: a cluster quality improvement trial. Arthritis Rheum. 2013;65(1):39–47.
Acknowledgements
The authors would like to thank Santa Fiori for her assistance in the manuscript preparation.
Funding
This work was supported in part by the Osteoarthritis Research Unit of the University of Montreal Hospital Research Centre (CRCHUM) and the Chair in Osteoarthritis of the University of Montreal, Montreal, Quebec, Canada. The funding sources had no role in the study design; the collection, analysis and interpretation of data; the writing of the report; or the decision to submit the article for publication.
Author information
Contributions
HB designed the study and performed the analysis. HB, JPP and JMP interpreted the results and drafted the article. All authors critically revised the article for intellectual content and approved the submitted version of the article. All authors are accountable for the accuracy and integrity of the work.
Ethics declarations
Ethics approval and consent to participate
Not relevant.
Consent for publication
Not relevant.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Bonakdari, H., Pelletier, JP. & Martel-Pelletier, J. A continuous data driven translational model to evaluate effectiveness of population-level health interventions: case study, smoking ban in public places on hospital admissions for acute coronary events. J Transl Med 18, 466 (2020). https://doi.org/10.1186/s12967-020-02628-x
DOI: https://doi.org/10.1186/s12967-020-02628-x
Keywords
 Transitional model
 Data processing
 Computer simulation
 Hybrid model
 Interrupted time series
 Lag time
 Nonlinear
 Public health