Incidence Rates and Deaths of Tuberculosis in HIV-Negative Patients in the United States and Germany as Analyzed by New Predictive Model for Infection


Incidence and mortality due to tuberculosis (TB) have been decreasing worldwide. Given that TB is a cosmopolitan disease, proper surveillance and evaluation are critical for controlling dissemination. Herein, mathematical modeling was performed in order to: 1) demonstrate a correlation between the incidence of TB in HIV-free patients in the US and Germany and their corresponding mortality rates; 2) show the utility of the newly developed D-R algorithm for analyzing and predicting the incidence of TB in both countries; and 3) estimate population death rates due to TB in HIV-negative patients. Using data published by the World Health Organization between 1990 and 2009, the relationship between incidence and mortality that could not be ascribed to HIV infection was evaluated. Using linear, quadratic and cubic curves, we found that a cubic function provided the best fit with the data in both the US (Y = 2.3588 + 2.2459X + 61.1639X^2 − 60.104X^3) and Germany (Y = 1.9271 + 9.4967X + 18.3824X^2 − 10.350X^3), where the correlation coefficient (R) between incidence and mortality was 0.995 and 0.993, respectively. Second, we demonstrated that fitted curves using the D-R model were equal to or better than those generated using the GM(1,1) algorithm, as exemplified in the relative values for Sum of Squares of Error, Relative Standard Error, Mean Absolute Deviation, Average Relative Error, and Mean Absolute Percentage Error. Finally, future trends using both the D-R and the classic GM(1,1) models predicted a continued decline in infection and mortality rates of TB in HIV-negative patients extending to 2015, assuming no changes to diagnosis or treatment regimens are enacted.









Predictive values of the D-R and GM(1,1) algorithms

From the 2011 WHO report on TB, the actual values for I-U, I-G, D-U and D-G (incidence and deaths in the US and Germany, respectively) were 4.1, 4.8, 0.18, and 0.25. The values predicted using the D-R model were 3.6, 4.7, 0.12 and 0.22, respectively. Calculations for the indexes SSE, RSE, MAD, ARE and MAPE derived from the D-R and GM(1,1) algorithms are summarized in Table 3. In general, the values for the D-R model are equal to or less than those generated using the GM(1,1) model, suggesting that the D-R algorithm was equal to or better than the GM(1,1) model for predicting trends in TB infection and mortality. These results held true for data derived from both the US and Germany.

Table 3. Calculation of key accuracy indexes spanning 1994–2009 evaluating the predictive quality of the D-R and GM(1,1) models.
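To make the accuracy indexes more concrete, the following is a minimal sketch of three of them (SSE, MAD and MAPE) using their common textbook definitions. The exact formulas used for Table 3 are not reproduced in this text, so these definitions are an assumption; RSE and ARE are left out because their definitions are not given here. The function and variable names are hypothetical. The only numbers used are the four actual and D-R predicted values quoted above.

```python
from typing import Sequence


def sse(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """Sum of squared errors between actual and fitted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


def mad(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def mape(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, fitted)) / len(actual)


# Values quoted in the text: WHO-reported actual values and D-R predictions.
who_actual = {"I-U": 4.1, "I-G": 4.8, "D-U": 0.18, "D-G": 0.25}
dr_predicted = {"I-U": 3.6, "I-G": 4.7, "D-U": 0.12, "D-G": 0.22}

# Percentage error of each single prediction (a one-point MAPE per series).
relative_error_pct = {
    key: mape([who_actual[key]], [dr_predicted[key]]) for key in who_actual
}

if __name__ == "__main__":
    for key, pct in relative_error_pct.items():
        print(f"{key}: {pct:.1f}% relative error")
```

Percentage-based indexes such as MAPE make the incidence and death series comparable even though their absolute magnitudes differ by more than an order of magnitude, whereas SSE and MAD are dominated by the larger-valued series.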



Correlation coefficient (R) analyses indicated that the incidence of TB and deaths due to TB in HIV-negative patients were closely related in both countries, i.e. U = 0.997, p<0.01 and G = 0.993, p<0.01. Comparisons among different simulation approaches showed that cubic parametric functions generated the best fit between incidence and mortality in TB-infected, HIV-negative patients. Actual data clearly showed that the incidence and mortality of TB in HIV-negative patients in the US and Germany decreased during the period 1990–2009. In predicting this trend using existing datasets, the curves generated using the D-R model were equal to or better than those generated by the GM(1,1) model with respect to coinciding with actual datasets, except for the last available dataset, i.e. 2010. It is possible that this difference lies in the fact that the 2010 dataset included HIV-positive patients in the reporting of incidence. As such, it was not included in our consideration of incidence for 2010. In addition, in those instances where rapid points of inflection appeared in the actual data, the self-adaptation component of the D-R model was able to correct within one data point, in stark contrast to the curves generated with the GM(1,1) algorithm.

The D-R model consists of weighting the first-order differential of the original data, weighting the short- and long-term trends, and then summing these with the original data to arrive at a predicted value. The advantage of this approach is that regardless of how the original data may change, the model can adapt to this change, resulting in little deviation between the simulated and actual values.

In contrast, the GM(1,1) model mainly relies on summing or totting-up the original data. In other words, rather than working directly with the original series, e.g. 1, 2, 3, 4, 5, the GM(1,1) model uses accumulated data or progressive totals, i.e. 1, (1+2) = 3, (1+2+3) = 6, (1+2+3+4) = 10, (1+2+3+4+5) = 15, etc., to arrive at the predicted values. As the series lengthens, the totting-up values also increase, which in turn reduces the flexibility of the GM(1,1) model. The D-R model has no such drawback. In general, we attribute the variability observed in the earlier years (1994–1996) to insufficient data (1990–1993) for simulating that part of the curve. These inconsistencies were accounted for as the number of years incorporated in the analysis increased.
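The accumulation step described above can be sketched in a few lines. The following is a textbook-style GM(1,1) implementation, offered as an illustration under stated assumptions rather than the exact code used for this study: the function name `gm11_forecast`, the use of NumPy, and the demonstration series are all hypothetical and are not WHO data.

```python
import numpy as np


def gm11_forecast(series, horizon=1):
    """Fit a textbook GM(1,1) grey model; return fitted values plus `horizon` forecasts."""
    x0 = np.asarray(series, dtype=float)
    n = x0.size
    # Accumulated generating operation: the progressive totals of the data.
    x1 = np.cumsum(x0)
    # Background values: means of consecutive accumulated values.
    z1 = 0.5 * (x1[1:] + x1[:-1])
    # Least-squares estimate of a and b in the grey equation x0(k) + a*z1(k) = b.
    B = np.column_stack((-z1, np.ones(n - 1)))
    (a, b), *_ = np.linalg.lstsq(B, x0[1:], rcond=None)
    # Time-response function for the accumulated series (assumes a is not ~0).
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    # Differencing the accumulated estimates returns to the original scale.
    x0_hat = np.empty(n + horizon)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)
    return x0_hat


# The progressive totals from the example in the text.
running_totals = np.cumsum([1, 2, 3, 4, 5])

# Hypothetical, steadily declining series (not WHO data).
demo_series = [10.0, 8.0, 6.4, 5.12, 4.096]
demo_fit = gm11_forecast(demo_series, horizon=2)

if __name__ == "__main__":
    print(running_totals)
    print(np.round(demo_fit, 3))
```

Because every parameter is estimated from the accumulated series as a whole, a single exponential curve is fitted to all points at once, which is consistent with the reduced flexibility noted above when the data contain abrupt inflection points.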

As shown in this study and confirmed in previous analyses (11), the D-R model can self-adapt. This allows the model to have a high level of flexibility and to generate reliable future trends with limited, i.e. incomplete, data. We hope to further improve this model by incorporating and weighting a delay operator (a dataset encompassing the differences between the fitted values and the actual values, to increase the fitting accuracy of the model) and by adding a limit algorithm to increase the model’s predictive character.

Herein we stringently compared our model to the GM(1,1) model because of the wide use of the GM(1,1) model in these types of analyses; however, other models, such as ARMA and ARIMA, are also commonly used for time series data. These time series algorithms are more suited to data with high periodicity. For this specific purpose, they may be better than the D-R model. However, when predicting trends involving general increases and/or decreases in the absence of recurring changes, such as what occurs with TB and drug resistance, these models are inferior to the D-R model.

By comparing predicted and actual values, we can conclude that the predictive values of the D-R model were accurate and feasible. In like manner, the predictive values of the GM(1,1) model were also accurate and can be used as a reference. Both the actual values and the values predicted by the D-R and GM(1,1) models indicated that under the current policies and prevention methods, I-U, I-G, D-U and D-G decreased over the period of the analysis, with the trend in D-G being the most significant.

Comparing our method with that used by the WHO, i.e. the log-linear model, we have concluded that the D-R model is better suited for simulating these data. First, the log-linear model is commonly used for examining “interactions” or the influence between two data sets or parameters, rather than for simulating and predicting future trends derived from one data set. In this manuscript, we first calculated the correlation coefficient, and then performed linear fitting, then curve fitting and additional curve fitting. Finally, we compared the various fitted curves to obtain the best equation that describes the occurrence and death rates of TB and their relationship to one another in the US and in Germany. We feel the log-linear model would have been more appropriate had we been interested in comparing the data from Germany and the US with respect to the effects of intervention strategies, for example, rather than predicting future trends. As such, we feel the D-R model is more suitable than the log-linear model for analyzing future trends. Other advantages of using either the D-R or GM models over the log-linear model are the ability to generate exact fitting results and the fact that the D-R and GM models, which are designed for predicting future trends, would be helpful for policy formulation and disease control.
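The fitting sequence described above (correlation, then linear, quadratic and cubic fits, then a comparison of the fitted curves) can be illustrated with a short sketch. The data here are synthetic: they are generated from the US cubic quoted in the abstract over a hypothetical range of X, so the example shows only the mechanics of the comparison and does not reproduce the WHO data. Reading X as mortality and Y as incidence is an assumption, as are all function and variable names; R is computed here as the correlation between observed and fitted values, which may differ from the statistic reported in the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# US cubic quoted in the abstract: constant, X, X^2, X^3 coefficients.
US_CUBIC = (2.3588, 2.2459, 61.1639, -60.104)


def compare_polynomial_fits(x, y, degrees=(1, 2, 3)):
    """Return {degree: (coefficients low-to-high, SSE, R)} for least-squares fits."""
    results = {}
    for degree in degrees:
        coeffs = P.polyfit(x, y, degree)
        fitted = P.polyval(x, coeffs)
        sse = float(np.sum((y - fitted) ** 2))
        # Correlation between observed and fitted values (an assumed reading of R).
        r = float(np.corrcoef(y, fitted)[0, 1])
        results[degree] = (coeffs, sse, r)
    return results


# Synthetic, noise-free demonstration data over a hypothetical X range.
x_demo = np.linspace(0.15, 0.60, 20)
y_demo = P.polyval(x_demo, US_CUBIC)
fits = compare_polynomial_fits(x_demo, y_demo)

if __name__ == "__main__":
    for degree, (coeffs, sse, r) in fits.items():
        print(f"degree {degree}: SSE={sse:.3e}, R={r:.6f}")
```

Because the three polynomial families are nested, the SSE can only decrease as the degree increases, so a lower SSE alone does not establish that the cubic is the better description; the size of the improvement relative to the extra parameter is what matters.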

We refrained from a specific one-on-one comparison of our model with that used to generate the most recent WHO reports because there was no way to guarantee that our datasets would be 100% congruent. For example, we culled TB patients that harbored HIV infections. Attempting to re-analyze the data used in our report using the log-linear model and generating results not consistent with the WHO report would also bring into question the data sources. As such, this was not done.

The efficiency of the D-R algorithm in mirroring actual data was further confirmed and supported by the accuracy indexes, defined here as SSE, RSE and MAD, which were lower than or equal to those generated by the GM(1,1) model. We also evaluated ARE and MAPE, which base their comparisons on percent differences rather than absolute numbers and as such may be more representative of the differences between the two algorithms. In this regard, the values obtained using the D-R model were consistently lower, suggesting better accuracy than those derived using the GM(1,1) model. Of particular interest are future predictions using the D-R model, which show that the incidence and mortality due to TB in HIV-negative patients should continue to decrease in both countries through 2015 (see Table 1). Using SPSS software, we tested the four groups of data by the chi-square trend test. Our results showed that all data sets were statistically significant (P<0.001) and generated the following values: I-U = 19.834, I-G = 18.979, D-U = 20.028, and D-G = 18.141. The higher value in the D-U data relative to that from D-G suggests a more pronounced reduction in the US death rate due to TB. The results imply that the current control measures have been effective. These data also suggest that D-U will decrease more significantly if the trends continue. As such, we predict that by 2015, the mortality could drop to 1% (0.01) in the US.

Clearly, the ability of our model to “self-correct” needs to be more exhaustively tested, and for this reason we evaluated the data related to TB in the US and Germany. Certainly, no one model can predict far into the future given that population interactions and environments are continuously changing and can never fully be accounted for. However, if small inflection points can be identified early on and a flexible model is available that can alter its predictive character for the near term as well as the intermediate future, such an algorithm could be a tremendous asset to policy makers, governments and the research and medical communities who have control over short-term intervention strategies. We believe the D-R model is one such algorithm with this potential.

In conclusion, we used a cubic parametric equation for the first time to show a relationship between incidence and mortality among HIV-negative TB-infected patients in America and Germany with the hope of using this as a benchmark for predicting future changes. Furthermore, such data allows us to link the incidence of TB to mortality and from this propose that prevention needs to receive more attention as a key approach to reducing mortality rates. In addition, we showed that the D-R model closely mirrored the trend line of TB through 2009 and from this, predicted changes to the infection and mortality rates through 2015.

Author Contributions

Conceived and designed the experiments: YR FD SS REB XR. Performed the experiments: FD YR XR. Analyzed the data: XR YR FD DZ. Contributed reagents/materials/analysis tools: FD DZ YR XR. Wrote the paper: YR FD DZ XR.


  1. WHO website. Global tuberculosis control (2011) Available: // Accessed 2012 August 21.
  2. Rylance J, Pai M, Lienhardt C, Garner P (2010) Priorities for tuberculosis research: a systematic review. The Lancet Infectious Diseases 10: 886–892.
  3. Nathanson E, Nunn P, Uplekar M, Floyd K, Jaramillo E, et al. (2010) MDR tuberculosis–critical steps for prevention and control. N Engl J Med 363: 1050–1058.
  4. Keshavjee S, Farmer PE (2010) Picking up the pace–scale-up of MDR tuberculosis treatment programs. N Engl J Med 363: 1781–1784.
  5. Gardy JL, Johnston JC, Ho Sui SJ, Cook VJ, Shah L, et al. (2011) Whole-genome sequencing and social-network analysis of a tuberculosis outbreak. N Engl J Med 364: 730–739.
  6. Sun YJ, Luo JT, Wong SY, Lee AS (2010) Analysis of rpsL and rrs mutations in Beijing and non-Beijing streptomycin-resistant Mycobacterium tuberculosis isolates from Singapore. Clin Microbiol Infect 16: 287–289.
  7. Teo SS, Alfaham M, Evans MR, Watson JM, Riordan A, et al. (2009) An evaluation of the completeness of reporting of childhood tuberculosis. Eur Respir J 34: 176–179.
  8. Alvarez GG, Gushulak B, Abu Rumman K, Altpeter E, Chemtob D, et al. (2011) A comparative examination of tuberculosis immigration medical screening programs from selected countries with high immigration and low tuberculosis incidence rates. BMC Infect Dis 11: 3.
  9. Lieberman R, Kwong H, Liu B, Huang H (2009) Computer-assisted detection (CAD) methodology for early detection of response to pharmaceutical therapy in tuberculosis patients. Proc Soc Photo Opt Instrum Eng 7260: 726030.
  10. WHO website. Global Health Observatory (GHO). Available: // Accessed 2012 August 21.
  11. Ding F, Zarlenga DS, Qin C, Ren X (2011) A novel algorithm to define infection tendencies in H1N1 cases in Mainland China. Infect Genet Evol 11: 222–226.
  12. Ding F, Zarlenga DS, Ren Y, Li G, Luan J, et al. (2011) Use of the D-R model to define trends in the emergence of Ceftazidime-resistant Escherichia coli in China. PLoS One 6: e27295.
  13. Xiong H, Xu H (2005) Grey Control. National Defense Industry Press, Beijing, China, ISBN: 7-118-04144-0 (in Chinese).
  14. Pan H (2005) Time Series Analysis. University of International Business and Economics Press, Beijing, China, ISBN: 7-81078-567-2 (in Chinese).