Hum. Reprod. Advance Access originally published online on January 18, 2007
Human Reproduction 2007 22(5):1353-1358; doi:10.1093/humrep/del521
The predictive value of medical history taking and Chlamydia IgG ELISA antibody testing (CAT) in the selection of subfertile women for diagnostic laparoscopy: a clinical prediction model approach
1 Centre for Reproductive Medicine, Department of Obstetrics and Gynaecology 2 Department of Clinical Epidemiology and Biostatistics, Academic Medical Centre, Amsterdam, The Netherlands 3 Department of Obstetrics and Gynaecology, Máxima Medical Centre, Veldhoven, The Netherlands 4 Department of Obstetrics and Gynaecology, Aberdeen Maternity Hospital, Aberdeen, UK
5 To whom correspondence should be addressed at: Máxima Medical Centre, Department of Obstetrics and Gynaecology, De Run 4600, 5500 MB Veldhoven, The Netherlands. Tel: +31 40 888 8384; Fax: +31 40 888 8387; E-mail: s.f.coppus{at}amc.uva.nl
| Abstract |
|---|
BACKGROUND: Medical history taking and Chlamydia antibody titre (CAT) testing are both currently used in the selection of patients for diagnostic laparoscopy with tubal patency testing. Most research has focused on the predictive value of CAT in isolation from medical history. We therefore assessed whether the combination of medical history and CAT improves the efficiency of selecting patients for laparoscopy compared with the use of either medical history or CAT alone.
METHODS: Data of 207 consecutive subfertile women were used to create multivariable logistic regression models for the prediction of tubal disease as diagnosed by diagnostic laparoscopy.
RESULTS: The model with data of medical history only had an area under the receiver operating characteristic curve (AUC) of 0.65 (95% CI 0.56–0.74). Addition of CAT increased the AUC to 0.70 (95% CI 0.62–0.78) (P = 0.065). CAT was positive in 40 women and showed a sensitivity of 0.37 (95% CI 0.26–0.49) for a specificity of 0.88 (95% CI 0.82–0.93). In CAT positive women, a blank medical history did not decrease the probability of tubal disease. Of the 167 women tested CAT negative, 23 (14%) still had a high probability of disease due to their medical history and 11 of them (48%) showed tubal abnormalities on diagnostic laparoscopy.
CONCLUSIONS: CAT testing adds valuable information to a woman's risk profile based on her medical history. The combination of medical history taking and CAT testing has a better yield for diagnosing tubal disease than either of these alone.
Key words: tubal pathology/medical history/Chlamydia antibody titer/CAT/laparoscopy
| Introduction |
|---|
Tubal pathology affects approximately 15–30% of subfertile women (Evers, 2002).
Since it was reported in 1979 that women with positive Chlamydia IgG antibodies are more likely to have tubal pathology than those without (Punnonen et al., 1979), numerous studies have reported on the value of Chlamydia antibody titre (CAT) testing to predict tubal pathology. Based on these reports, the Dutch Society for Obstetrics and Gynaecology (NVOG) recommends the use of CAT as a first-line test in the basic work-up of subfertile couples, with a fixed cut-off level (IgG MIF >1:32 or ELISA >1.1) above which post-infectious pelvic disease should be ruled out with laparoscopy and chromotubation (Swart et al., 1995; NVOG, 2004). The pre-test probability of disease of a woman, based on specific risk factors or other characteristics from the woman's medical history, is not formally taken into account. In contrast to the Dutch guideline, the guideline from the National Institute for Clinical Excellence (NICE) in the United Kingdom does not recommend CAT testing in the workup of subfertile women, but advises the use of the woman's past medical history to decide whether diagnostic laparoscopy should be performed or not (NICE, 2004).
Most diagnostic test research is interpreted in isolation from its clinical context. However, this does not reflect daily clinical practice, in which the diagnostic investigation is a consecutive process always starting with history taking and physical examination, followed by laboratory tests and more invasive and costly tests such as imaging (Moons et al., 2004). With multivariable analysis one can reveal the add-on value of diagnostic tests (Moons and Grobbee, 2002), in this case the additional value of CAT testing.
The present study therefore aimed to explore the diagnostic consequences of medical history taking by itself as recommended by the NICE guideline, CAT testing, as recommended by the NVOG, and the combined interpretation of the results from medical history and CAT testing, which is intuitively used in daily practice by many clinicians.
| Materials and methods |
|---|
Patients
We used data from a previously reported study, in which 207 consecutive women referred for evaluation of subfertility underwent a diagnostic laparoscopy. Characteristics of included women and the materials and methods used in this study have been published previously (Logan et al., 2003).
Data analysis
We developed multivariable logistic regression models to predict tubal pathology using the following data from the medical history: female age, number of previous term deliveries, previous induced abortion, type of subfertility (i.e. primary or secondary), duration of subfertility, history of pelvic inflammatory disease (PID), history of sexually transmitted disease, previous use of an intrauterine contraceptive device and prior non-gynaecological pelvic surgery. We evaluated the prognostic value of CAT both as a dichotomous and as a continuous variable. First, we checked the assumption of linearity between each of the continuous variables (age, duration of subfertility, number of previous term deliveries and titre of CAT) and tubal pathology. The relationship was visualized with smoothed piecewise polynomials (splines) and formally tested with a generalized additive model. As we found no significant deviations from linearity in the relationship with the outcome, it was not necessary to transform these continuous variables. Subclasses within categorical variables were dichotomized. For dichotomous and continuous variables, univariable beta coefficients (β), odds ratios (ORs) and 95% confidence intervals (95% CI) as well as P-values were calculated. In the univariable analyses, the P-value for statistical significance was set at 0.05.
Subsequently, multivariable logistic regression analysis with a stepwise backwards selection procedure was used to construct the prediction models. Since the erroneous exclusion of a variable is more deleterious for a model than including too many factors (Mol et al., 2000; Steyerberg et al., 2000), we used a significance level of 30% for entry in the multivariable model and a significance level of 20% to stay in the model.
To adjust for overfitting, we performed internal validation with bootstrapping. Bootstrapping is a technique to evaluate the robustness of the model by simulating sampling variation in repeated samples from the study population. Cases are selected randomly from the original sample to create a new sample of the same size. As cases are drawn with replacement, some cases may be randomly selected multiple times and others not at all (Steyerberg et al., 2001). We generated 500 bootstrap data sets in which the same multivariable logistic regression model was estimated. By analysing the difference between the model based on the original dataset and the bootstrap estimates, a shrinkage factor was calculated and used to correct the original model coefficients and associated probabilities for overfit.
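The resampling step described above can be sketched as follows. This is only an illustration of drawing cases with replacement, not the authors' analysis (which was run in SPSS and S-PLUS): the case identifiers are placeholders, the seed is arbitrary, and no regression model is fitted here.

```python
import random


def bootstrap_sample(cases, rng):
    """Draw len(cases) cases with replacement from the original sample."""
    return [rng.choice(cases) for _ in cases]


rng = random.Random(0)      # arbitrary fixed seed, for reproducibility only
cases = list(range(207))    # placeholder case IDs; 207 matches the cohort size
samples = [bootstrap_sample(cases, rng) for _ in range(500)]

# Fraction of distinct original cases that appear in each bootstrap data set.
unique_fractions = [len(set(s)) / len(cases) for s in samples]
mean_unique = sum(unique_fractions) / len(unique_fractions)
print(f"mean fraction of original cases per bootstrap sample: {mean_unique:.3f}")
```

Because sampling is with replacement, each bootstrap data set typically contains only a subset of the original cases, with the remainder made up of repeats. In a full internal validation, the model would be refitted in each of these data sets.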
We developed two models. The first model (clinical history model) was based on items from medical history only. In the second model (clinical history and CAT model) previous infection with C. trachomatis as determined with CAT ELISA was combined with medical history. This second model was developed to explore what the result of CAT adds to the information already obtained from medical history taking. To make comparisons, as a third model, CAT was dichotomized into positive or negative test results, as recommended by the Dutch guideline.
The discriminative capacity of the two constructed models was evaluated with receiver operating characteristic (ROC) curve analysis, in which the area under the curve (AUC) was calculated. Accuracy of CAT was plotted into the ROC curve as a comparison. Sensitivity was defined as the proportion of women with tubal pathology who were predicted correctly, and specificity was defined as the proportion of women without tubal pathology who were predicted correctly. Calibration of the model, which is the agreement between predicted and observed probabilities, was evaluated with the goodness-of-fit Hosmer and Lemeshow test statistic and by visual inspection of the calibration plots. Finally, we explored the diagnostic consequences of different strategies. Data were analysed using SPSS 12.0.1 (SPSS Inc., Chicago, IL, USA) and S-PLUS 2000 (MathSoft Inc.).
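As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen woman with tubal pathology receives a higher predicted probability than a randomly chosen woman without, with ties counted as one half. The function name and the four example scores are hypothetical and are not data from this study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one diseased and one non-diseased case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and laparoscopy outcomes (1 = pathology).
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(example_auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.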
| Results |
|---|
The prevalence of tubal pathology was 30.4% (63/207). Demographic data of included women have been published elsewhere (Logan et al., 2003).
The backward logistic regression selection procedure excluded duration of subfertility due to a P-value of 0.38 in the clinical history and CAT model. However, as one of the aims of the present study was to compare medical history taking with an integrated approach of history taking and CAT testing, exclusion of this variable in the second model would have prohibited this exploration. Analysis of the regression coefficients and 95% CI for the OR of this variable showed that duration of subfertility only had a minor contribution towards the estimated probabilities. To allow comparison, we therefore decided to force duration of subfertility into the second model (Table I). The possible additional value of use of CAT as a continuous variable in the prediction model was explored, but did not improve the discriminatory capacity (data not shown). As a result, CAT was used as dichotomous variable in the clinical history and CAT model.
Internal validation with bootstrapping showed a 17.5% overfit for the clinical history model, and a 14% overfit for the clinical history and CAT model. Therefore the regression coefficients of the model variables were corrected with these shrinkage factors.
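One common way to apply such a correction is to multiply each regression coefficient by a uniform shrinkage factor equal to one minus the estimated overfit. The sketch below assumes that interpretation; the text does not spell out the exact procedure, the coefficient values are hypothetical, and the intercept (usually re-estimated after shrinkage) is left out.

```python
def shrink(coefficients, overfit):
    """Multiply every coefficient by a uniform shrinkage factor (1 - overfit)."""
    factor = 1.0 - overfit
    return {name: factor * value for name, value in coefficients.items()}


# Hypothetical unshrunk coefficients, for illustration only.
unshrunk = {"history_of_pid": 1.0, "positive_cat": 1.5}
shrunk = shrink(unshrunk, overfit=0.14)   # 14% overfit, as for the second model
print(shrunk)
```

Shrinking the coefficients towards zero makes the predicted probabilities less extreme, which is the intended protection against overfitting.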
The ROC curves of the two models are shown in Figure 1. The clinical history model showed an AUC of 0.65 (95% CI 0.56–0.74). The clinical history and CAT model increased the AUC to 0.70 (95% CI 0.62–0.78) (P = 0.065) (De Long et al., 1988). As a comparison, the performance of CAT alone (sensitivity 37%, 95% CI 26–49; specificity 88%, 95% CI 82–93) is plotted in the ROC space.
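The reported accuracy of CAT alone can be checked against a 2 x 2 table. The cell counts below are not given in the text; they are back-calculated from the reported totals (207 women, 63 with tubal pathology, 40 CAT positive) and the reported sensitivity and specificity, so they should be read as an assumption.

```python
def sensitivity(tp, fn):
    """Proportion of women with tubal pathology who test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of women without tubal pathology who test negative."""
    return tn / (tn + fp)


# Assumed cell counts, back-calculated from the reported totals.
tp, fp, fn, tn = 23, 17, 40, 127
assert tp + fn == 63 and tp + fp == 40 and tp + fp + fn + tn == 207

cat_sensitivity = sensitivity(tp, fn)
cat_specificity = specificity(tn, fp)
print(f"sensitivity {cat_sensitivity:.2f}, specificity {cat_specificity:.2f}")
```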
Calibration of the models was acceptable, as confirmed with the Hosmer and Lemeshow goodness-of-fit test statistic. This test statistic had a P-value of 0.30 for the clinical history model and a P-value of 0.35 for the clinical history and CAT model. Table II shows the agreement between different categories of predicted and observed probabilities in the two prediction models. Both models show an acceptable agreement between the mean predicted probabilities and the observed probability of tubal pathology. However, above a 30% predicted probability, both models underestimate the chance of disease. In Table II, the categories of predicted probability of the clinical history model and the clinical history and CAT model are also cross-tabulated against the results of laparoscopy and CAT. Using the clinical history model, 28 women who tested CAT positive are classified as having a predicted probability of tubal pathology <30%. However, 12 of these 28 women (43%) still showed abnormalities at laparoscopy (Table II). Using the clinical history and CAT model, the same 28 women who would have been classified as low risk patients based on clinical history only shift to a high risk classification (i.e. a predicted probability of tubal disease >30%) (Table II). On the other hand, Table II also shows that among the 167 women who tested CAT negative, 23 (14%) still had a predicted probability of >30% of tubal disease due to their medical history. In this selected group of CAT negative women, 48% (11/23) still showed tubal abnormalities on diagnostic laparoscopy.
To show the consequences of medical history taking by itself, CAT testing by itself and a combination of medical history taking and CAT testing, diagnostic and clinical parameters are listed in Table III. For comparison, the consequences of performing diagnostic laparoscopy in all women without any selection strategy are shown in the first column. The number of laparoscopies that have to be performed to detect one woman with tubal pathology is comparable when using history, CAT or history and CAT, and much lower than without any workup. However, the detection rate of tubal pathology (true positive rate) is highest when using the combined model (54% versus 38% and 37%, respectively).
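The number of laparoscopies needed to detect one woman with tubal pathology is the number of procedures performed divided by the number of women in whom pathology is found. A minimal sketch for the strategy without any selection, using only the cohort totals stated in the text (the function name is hypothetical, and the figures for the other strategies depend on Table III, which is not reproduced here):

```python
def laparoscopies_per_case(n_laparoscopies, n_with_pathology):
    """Laparoscopies performed per woman found to have tubal pathology."""
    if n_with_pathology <= 0:
        raise ValueError("at least one detected case is required")
    return n_laparoscopies / n_with_pathology


# No selection: all 207 women undergo laparoscopy and 63 have tubal pathology.
no_selection = laparoscopies_per_case(207, 63)
print(f"{no_selection:.1f} laparoscopies per detected case")
```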
| Discussion |
|---|
In this study we evaluated the efficiency of medical history taking, CAT testing and a combination of both in selecting women for laparoscopy to detect tubal pathology. We found that the discriminative capacity of the medical history model did not differ from that of CAT testing alone. However, combined interpretation of both history and CAT identified the women at highest risk for tubal disease, and increased sensitivity for a similar specificity. In CAT positive women, the probability of tubal disease is so high that a prompt diagnostic laparoscopy is indicated, irrespective of whether the woman has a suspect or non-suspect medical history. A small subgroup of CAT negative women (14%) still have a high probability of tubal disease due to their history, and laparoscopy will identify tubal abnormalities in around half of these women. The majority of women (70%) in this cohort were classified as having a probability <30% of tubal disease due to a negative CAT status and a non-suspect clinical history. Since 80% of these women will show no tubal abnormalities, laparoscopy in these women can be deferred.
Our study has some limitations. First, both our study cohort and the actual number of patients with positive factors in the medical history were relatively small. This limits the statistical power of this study and leads to relatively large confidence intervals. However, the strength of using this cohort of women is the fact that the decision to perform a diagnostic laparoscopy was irrespective of the result of CAT. This makes it possible to obtain estimates of diagnostic performance without partial verification bias, a limitation encountered in many other diagnostic studies on this subject (Logan et al., 2003; Whiting et al., 2004).
A second limitation of this study is the fact that we dichotomized the result of the CAT ELISA assay. A progressive increase in the likelihood of tubal pathology in patients with a higher titre of the Chlamydia antibody test has been shown for MIF tests (Thomas et al., 2000) as well as for ELISA (Tanikawa et al., 1996). The antibody titres of women in our cohort showed only little diversity, as a result of which we were not able to demonstrate this relation in the present study. It may be possible that use of a MIF test would have resulted in a larger diversity between IgG titres of the patients, leading to extra diagnostic information if incorporated into the prediction model. MIF tests, however, have been shown to suffer from a significant amount of cross-reactivity with Chlamydia pneumoniae. Furthermore, interpretation of these tests is subject to inter-observer variation (Gijsen et al., 2001; Land et al., 2003). Moreover, a recent study showed the pELISA used in this study to be the one most specific for a previous C. trachomatis infection (Land et al., 2003).
A third limitation concerns our definition of tubal pathology. Broad criteria were used, namely presence of adhesions involving the Fallopian tubes and ovaries, clubbing of the tubes and hydrosalpinges or obstruction to the flow of dye irrespective of severity or whether the pathology was uni- or bilateral. Previous studies have shown that the diagnostic performance of CAT increases when the label tubal disease is limited to the more severe cases (Land et al., 1998). Due to a low number of women with bilateral tubal disease in this study, we were unable to restrict our analysis to these cases. Detection of endometriosis by laparoscopy was not part of the intention to screen by medical history, since CAT will be unable to predict adhesions or damage due to endometriosis. Although this approach might be open to debate, this has been done previously in studies on the subject (Akande et al., 2003; Logan et al., 2003). Data on the capacity of medical history to predict endometriosis have been published previously (Calhaz-Jorge et al., 2004).
One previous study has attempted to predict tubal pathology by means of a multivariable logistic regression model (Hubacher et al., 2004). Women were divided into a high probability and low probability group based on a logistic regression model that included four variables, namely previous symptoms of PID, vaginal discharge, lower genital tract infection and presence of antibodies to C. trachomatis. The results demonstrated a sensitivity of 84% with a specificity of 29%. The authors concluded that the role of history taking in the evaluation of women with tubal factor infertility is limited. Although clinical history does not have the ultimate discriminative power to distinguish between patients with and without tubal pathology, we have demonstrated in the present study that it can be used to create individualized patient risk profiles and have shown the additional value of CAT testing upon medical history taking.
In our study we included women with both uni- and bilateral tubal disease. Unilateral tubal occlusion or tubo-ovarian adhesions may compromise fertility prospects moderately, but only bilateral tubal occlusion is known to virtually exclude all possibility of a spontaneous pregnancy (Mol et al., 1999). Therefore, it is questionable whether all women with minor tubal pathology should be labelled as having tubal disease. To date, no evidence-based policies for the treatment of unilateral tubal disease have been published (Ahmad et al., 2006), and therefore detection of one-sided tubal disease is unlikely to result in a major shift in treatment. Since most of these women have a low predicted probability of disease, postponing laparoscopy in these women may allow them to conceive spontaneously (Van der Steeg et al., 2006), thereby making laparoscopy redundant. Unfortunately, we did not have data about treatment independent pregnancy rates in this cohort of patients, and cannot answer this question.
Our model allows differentiation between women with a high predicted chance of tubal pathology and women with a low chance. What probability of diagnostic uncertainty is acceptable for women is not known at the moment. It may be possible that women prefer to have 100% certainty about their tubal integrity. In that case, since no workup will ever be 100% accurate, all women will require diagnostic laparoscopy. This will increase costs and enlarge the pool of women with a diagnosis of minimal pathology of questionable prognostic significance (Collins et al., 1995). Use of a multivariable prediction model including history and CAT allows a conjoint decision of whether or not to perform a laparoscopy, based on the woman's as well as the doctor's preferences. Depending on these preferences, one can perform a diagnostic laparoscopy in case of a high predicted probability, postpone laparoscopy for several months in cases where the predicted probability is low or use a less invasive alternative, i.e. hysterosalpingography or hysterosalpingo-contrast-sonography. Women should be adequately informed regarding the chance of delaying the diagnosis in case of low predicted probability and negative CAT status.
In this study, we explored the value of medical history taking and Chlamydia antibody testing in predicting tubal subfertility by constructing diagnostic prediction models. Whether the use of such models is feasible and clinically effective in daily routine practice is not known. Further research will have to evaluate women's preferences regarding the trade-off between likelihood of detecting tubal pathology and the risks and discomfort associated with a potentially unnecessary diagnostic laparoscopy.
In conclusion, combined use of medical history taking and CAT testing has superior diagnostic accuracy to either of these alone. CAT testing adds predictive value to a woman's medical history risk profile.
| Appendix |
|---|
The probability that a patient has tubal pathology can be calculated with the following formula: probability of tubal pathology = 1/[1 + exp(−β)], in which β for the different models is:
Clinical history model: β = −1.764 + (0.629 × number of term deliveries) + (0.0116 × duration of subfertility) + (0.979 × positive history of PID) + (0.955 × history of non-gynaecologic pelvic surgery).
Clinical history and CAT model: β = −1.874 + (0.616 × number of term deliveries) + (0.0069 × duration of subfertility) + (1.020 × positive history of PID) + (1.201 × history of non-gynaecologic pelvic surgery) + (1.208 × positive CAT).
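The formula can be evaluated as in the sketch below. Two points are assumptions rather than statements from the text: the intercepts are taken as negative and the exponent as −β, chosen because that is the only combination consistent with the reported 30.4% prevalence and with a positive PID history or positive CAT raising the risk; and the unit of duration of subfertility is not specified here, so it is passed through as given. Dichotomous predictors are coded 1 for present and 0 for absent, and the variable names are hypothetical.

```python
import math

# Coefficients as printed in the appendix; intercept signs are assumed negative.
HISTORY_MODEL = {
    "intercept": -1.764,
    "term_deliveries": 0.629,
    "duration_subfertility": 0.0116,
    "pid": 0.979,
    "pelvic_surgery": 0.955,
}
HISTORY_CAT_MODEL = {
    "intercept": -1.874,
    "term_deliveries": 0.616,
    "duration_subfertility": 0.0069,
    "pid": 1.020,
    "pelvic_surgery": 1.201,
    "cat_positive": 1.208,
}


def tubal_pathology_probability(model, **predictors):
    """Return 1 / (1 + exp(-beta)) for the given model and predictor values.

    Predictors that are not passed are treated as zero; an unknown predictor
    name raises KeyError.
    """
    beta = model["intercept"]
    for name, value in predictors.items():
        beta += model[name] * value
    return 1.0 / (1.0 + math.exp(-beta))


baseline = tubal_pathology_probability(HISTORY_MODEL)
cat_only = tubal_pathology_probability(HISTORY_CAT_MODEL, cat_positive=1)
print(f"history model, no risk factors: {baseline:.3f}")
print(f"history + CAT model, positive CAT only: {cat_only:.3f}")
```

Under these assumptions, a positive CAT on its own already gives a predicted probability above 30%, which agrees with the observation in the Results that CAT positive women with an unremarkable history move into the high risk category.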
| Acknowledgements |
|---|
This work was supported by grant 91.46.364 in the VIDI-program of ZonMW, The Hague, The Netherlands, and by the Máxima Medical Centre, Veldhoven, The Netherlands. The funding sources had no involvement in the design, analysis or reporting of this study.
| Footnotes |
|---|
Part of this paper was presented as an oral presentation at the 22nd Annual Meeting of the European Society of Human Reproduction and Embryology, which was held in Prague, Czech Republic, 18–21 June 2006.
| REFERENCES |
|---|
Ahmad G, Watson A, Vandekerckhove P, Lilford R. (2006) Techniques for pelvic surgery in subfertility. Cochrane Database Syst Rev 2 CD000221.
Akande VA, Hunt LP, Cahill DJ, Caul EO, Ford WC, Jenkins JM. (2003) Tubal damage in infertile women: prediction using Chlamydia serology. Hum Reprod 18:1841–1847.
American Fertility Society. (1985) Revised American Fertility Society classification of endometriosis. Fertil Steril 43:351–352.
Calhaz-Jorge C, Mol BW, Nunes J, Costa AP. (2004) Clinical predictive factors for endometriosis in a Portuguese infertile population. Hum Reprod 19:2126–2131.
Collins JA, Burrows EA, Wilan AR. (1995) The prognosis for live birth among untreated infertile couples. Fertil Steril 64:22–28.
De Long ER, De Long DM, Clarke-Pearson DL. (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845.
Evers JL. (2002) Female subfertility. Lancet 360:151–159.
Gijsen AP, Land JA, Goossens VJ, Leffers P, Bruggeman CA, Evers JL. (2001) Chlamydia pneumoniae and screening for tubal factor subfertility. Hum Reprod 16:487–491.
Hubacher D, Grimes D, Lara-Ricalde R, de la Jarra J, Garcia-Luna A. (2004) The limited clinical usefulness of taking a history in the evaluation of women with tubal factor subfertility. Fertil Steril 81:6–10.
Land JA, Evers JL, Goossens VJ. (1998) How to use Chlamydia antibody testing in subfertility patients. Hum Reprod 13:1094–1098.
Land JA, Gijsen AP, Kessels AG, Slobbe ME, Bruggeman CA. (2003) Performance of five serological chlamydia antibody tests in subfertile women. Hum Reprod 18:2621–2627.
Logan S, Gazvani R, McKenzie H, Templeton A, Bhattacharya S. (2003) Can history, ultrasound, or ELISA chlamydial antibodies, alone or in combination, predict tubal factor subfertility in subfertile women? Hum Reprod 18:2350–2356.
Mol BW, Swart P, Bossuyt PM, van der Veen F. (1999) Prognostic significance of diagnostic laparoscopy for spontaneous fertility. J Reprod Med 44:81–86.
Mol BW, van Wely M, Steyerberg EW. (2000) Using prognostic models in clinical infertility. Hum Fertil (Camb) 3:199–202.
Moons KG and Grobbee DE. (2002) Diagnostic studies as multivariable, prediction research. J Epidemiol Community Health 56:337–338.
Moons KG, Biesheuvel CJ, Grobbee DE. (2004) Test research versus diagnostic research. Clin Chem 50:27–36.
Nederlandse Vereniging voor Obstetrie en Gynaecologie. (2004) Orienterend fertiliteitsonderzoek [NVOG guideline diagnostic fertility workup].
National Institute for Clinical Excellence. (2004) Guideline CG11: Fertility: assessment and treatment for people with fertility problems. http://www.nice.org.uk/download.aspx?o=CG011fullguideline.
Punnonen R, Terho P, Nikkanen V, Meurman O. (1979) Chlamydial serology in infertile women by immunofluorescence. Fertil Steril 31:656–659.
Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. (2000) Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med 19:1059–1079.
Steyerberg EW, Harrell FE, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. (2001) Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 54:774–781.
Swart P, Mol BW, van der Veen F, van Beurden M, Redekop WK, Bossuyt PM. (1995) The accuracy of hysterosalpingography in the diagnosis of tubal pathology: a meta-analysis. Fertil Steril 64:486–491.
Tanikawa M, Harada T, Katagiri C, Onohara Y, Yoshida S, Terakawa N. (1996) Chlamydia trachomatis antibody titres by enzyme-linked immunosorbent assay are useful in predicting severity of adnexal adhesion. Hum Reprod 11:2418–2421.
Thomas K, Coughlin L, Mannion PT, Haddad NG. (2000) The value of Chlamydia trachomatis antibody testing as part of routine infertility investigations. Hum Reprod 15:1079–1082.
Van der Steeg JW, Steures P, Eijkemans MJ, Habbema JD, Hompes PG, Broekmans FJ, et al. (2006) Pregnancy is predictable: a large-scale prospective external validation of the prediction of spontaneous pregnancy in subfertile couples. Hum Reprod Epub ahead of print.
Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J. (2004) Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Ann Intern Med 140:189–202.
Submitted on July 14, 2006; resubmitted on November 16, 2006; accepted on December 14, 2006.