Osteoarthritis of the hip or knee is a chronic condition mostly treated with analgesics and non-steroidal anti-inflammatory drugs, but these drugs can cause serious gastrointestinal and cardiovascular adverse events, especially with long term use. Disease modifying agents that not only reduce joint pain but also slow the progression of the condition would be desirable. Throughout the world for the past 10 years, the cartilage constituents chondroitin and glucosamine have been increasingly recommended in guidelines, prescribed by general practitioners and rheumatologists, and used by patients as over the counter medications to modify the clinical and radiological course of the condition. Global sales of glucosamine supplements reached almost $2bn (£1.3bn, €0.8bn) in 2008, which represents an increase of about 60% compared with 2003, with a forecasted continued growth through 2013 reaching $2.3bn. The oral administration of cartilage constituents in patients with osteoarthritis is thought to make up for the apparent cartilage loss in affected joints. Chondroitin is a highly hydrophilic, gel forming polysaccharide macromolecule. Its hydrocolloid properties convey much of the compressive resistance of cartilage. Glucosamine is an amino sugar that is a building block for the glycosaminoglycans that are part of the structure of cartilage. Ingested chondroitin and glucosamine are both partially absorbed in the intestine, and it has been suggested that some of the ingested amount reaches the joints.
Results from randomised trials about the effectiveness of chondroitin and glucosamine are conflicting. Trials that have reported large effects on joint pain were often hampered by poor study quality and small sample sizes, whereas large methodologically sound trials often found only small or no effects.
Bayesian approaches towards network meta-analyses allow a unified, coherent analysis of data recorded at multiple time points in randomised trials that compare either of these preparations with placebo or head to head. The approaches fully respect randomisation, account for the correlation of multiple observations within the same trial, and allow the estimation of the relative effectiveness of the different preparations and their combination. We performed a systematic review with network meta-analysis including data from large methodologically sound randomised trials at multiple follow-up times to determine the effect of these preparations on joint pain and on radiological progression of disease.
We searched the Cochrane Controlled Trials Register, Medline, Embase, and CINAHL (from inception to June 2010) using a combination of keywords and text words related to osteoarthritis; these were combined with generic and trade names of the various preparations plus a validated filter for controlled clinical trials. We also retrieved reports citing relevant articles via Science Citation Index (1981-2008). In addition, we manually searched conference proceedings and text books, screened reference lists of all obtained papers, and contacted content experts.
We included randomised trials with an average of at least 100 patients with knee or hip osteoarthritis per arm. Trials compared chondroitin sulphate, glucosamine sulphate, glucosamine hydrochloride, or the combination of any two with placebo or head to head. A sample size of 2×100 patients will yield more than 80% power to detect a small to moderate effect size of −0.40 at a two sided P=0.05, which corresponds to a difference of 1 cm on a 10 cm visual analogue scale between the experimental and control intervention. Two of four reviewers (BT, EN, SR, ST) evaluated reports independently for eligibility. They excluded trial arms with sub-therapeutic doses (<800 mg/day of chondroitin and <1500 mg/day of glucosamine, in accordance with doses licensed in Europe). Disagreements were resolved by consensus.
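The sample size reasoning above can be checked with a normal approximation. This is a minimal sketch, not the authors' calculation: the function name `power_two_sample` is hypothetical, the two sample z test is an assumed approximation, and the 2.5 cm SD is the median pooled SD quoted later in the methods.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(effect_size: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two sided, two sample z test (assumed approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected z statistic for a standardised difference with equal arm sizes.
    shift = abs(effect_size) * sqrt(n_per_arm / 2)
    # Probability of rejecting the null hypothesis in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = power_two_sample(-0.40, n_per_arm=100)
vas_difference_cm = 0.40 * 2.5  # effect size times the assumed SD of 2.5 cm
print(f"power = {power:.3f}, VAS difference = {vas_difference_cm:.1f} cm")
```

Under these assumptions the power comes out slightly above 80%, in line with the statement in the text.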
The prespecified primary outcome was absolute pain intensity reported in any of nine time windows organised in increments of three months (up to 3 months, 6, 9, 12, 15, 18, 21 months, and 22 months or more). If more than one time point was reported in a window, we extracted data nearest to the longest follow-up time included in that window; for the window covering 22 months or more, we extracted the follow-up closest to 24 months. When an article provided data on more than one pain scale, we referred to a previously described hierarchy of pain related outcomes and extracted the outcome that was highest on this list. Global pain took precedence over pain on walking and pain subscores on the Western Ontario and McMaster Universities (WOMAC) arthritis index. For example, if a trial report provided data on both global pain scores and WOMAC pain subscores, we recorded only data on global pain scores. Secondary outcomes were changes in the minimum radiographic joint space between baseline and the end of treatment, the number of individuals withdrawn or who dropped out because of an adverse event, and the number of patients experiencing any adverse event.
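The extraction rules above can be sketched as a small selection routine. This is an illustration only: all names are hypothetical, follow-up is assumed to be given in whole months, the window labels enumerated in the text are treated as inclusive upper bounds, and the ranking of pain on walking above WOMAC pain is an assumption (the text states only that global pain ranks first).

```python
# Upper bounds (months) of the bounded windows, assumed inclusive.
BOUNDED_WINDOWS = [3, 6, 9, 12, 15, 18, 21]
OPEN_WINDOW = "22+"
# Assumed order of precedence, highest first.
PAIN_HIERARCHY = ["global pain", "pain on walking", "WOMAC pain"]


def window_for(month: int):
    """Return the label of the time window containing a follow-up month."""
    for upper in BOUNDED_WINDOWS:
        if month <= upper:
            return upper
    return OPEN_WINDOW


def select_time_points(months):
    """Pick one follow-up month per window."""
    by_window = {}
    for month in months:
        by_window.setdefault(window_for(month), []).append(month)
    chosen = {}
    for window, candidates in by_window.items():
        if window == OPEN_WINDOW:
            # Open ended window: take the follow-up closest to 24 months.
            chosen[window] = min(candidates, key=lambda m: abs(m - 24))
        else:
            # Bounded window: take the latest follow-up inside it.
            chosen[window] = max(candidates)
    return chosen


def select_pain_outcome(reported):
    """Return the reported pain outcome that ranks highest in the hierarchy."""
    for outcome in PAIN_HIERARCHY:
        if outcome in reported:
            return outcome
    return None


print(select_time_points([1, 3, 4, 6, 12, 23, 30]))
print(select_pain_outcome({"WOMAC pain", "global pain"}))
```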
Two of the four reviewers independently assessed concealment of allocation, blinding, and adequacy of analyses. Concealment of allocation was considered adequate if the investigators responsible for the selection of patients did not know before allocation which treatment was next in line (central randomisation, sealed, opaque, sequentially numbered assignment envelopes, coded drug packs, etc). Any procedures based on predictable generation of allocation sequences, and potentially transparent attempts to conceal allocation, such as non-opaque envelopes, were considered inadequate. We extracted the number of patients initially randomised and the number of patients analysed per group at each time point to distinguish between trials that had included all randomised patients in the analysis (intention to treat analysis) and trials that had not. Finally, we determined whether experimental preparations had undergone quality control, that is, whether either a formally approved preparation was used or pharmacological laboratory analysis confirmed the content of the preparation. Disagreements were resolved by consensus.
Two of the four reviewers used a standardised form to extract in duplicate data on publication status, trial design, patients’ characteristics, treatment regimens, outcome modalities, and funding. Results of pain, joint space narrowing, and adverse events were extracted by one reviewer (ST) and cross checked by another (PJ). When necessary, means and measures of dispersion were approximated from figures in the reports.
We used an extension of multivariable Bayesian hierarchical random effects models for mixed multiple treatment comparisons with minimally informative prior distributions. The model fully preserves the comparison of randomised treatments within each trial while combining all available comparisons between treatments and accounts for multiple comparisons within a trial when there are more than two treatment arms. For the analysis of effect sizes of pain, the model included random effects at the level of trials and time points. It accounted for the correlation of outcome data reported at different time points within a trial and allowed the estimation of the variance of treatment effects between trials (τ²). Effect sizes were calculated by dividing the differences in mean values between treatment groups in a time window by the median pooled standard deviation (SD) observed across all time points in a trial [23]. If SDs were not provided, we calculated them from standard errors or confidence intervals as described elsewhere [10, 24]. An effect size of −0.20 SD units corresponds to an overlap of about 85% between the distributions of reported pain scores in the experimental and placebo groups and can be considered a small difference between the experimental and control groups. An effect size of −0.50 corresponds to an overlap of about 67% and can be considered a moderate difference, whereas −0.80 corresponds to an overlap of about 53% and is considered a large difference.
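The effect size calculation and the overlap percentages can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the conversions from standard errors and confidence intervals use the usual normal theory formulas (the text refers to other sources for the exact method), and the overlap is computed as one minus Cohen's non-overlap measure U1, which reproduces the quoted percentages but is not named in the text.

```python
from math import sqrt
from statistics import NormalDist, median


def sd_from_se(se: float, n: int) -> float:
    """Recover a group SD from the standard error of its mean."""
    return se * sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, level: float = 0.95) -> float:
    """Recover a group SD from a confidence interval around its mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return sd_from_se((upper - lower) / (2 * z), n)


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled SD of two groups at one time point."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def effect_size(mean_exp: float, mean_ctl: float, pooled_sds) -> float:
    """Mean difference divided by the median pooled SD across time points."""
    return (mean_exp - mean_ctl) / median(pooled_sds)


def overlap(d: float) -> float:
    """Overlap of two unit variance normal distributions, as 1 - Cohen's U1 (assumed)."""
    u2 = NormalDist().cdf(abs(d) / 2)
    u1 = (2 * u2 - 1) / u2
    return 1 - u1


# Hypothetical trial: pooled SDs at three time points, means in one window.
print(effect_size(4.0, 5.0, [2.4, 2.5, 2.6]))
for d in (-0.20, -0.50, -0.80):
    print(d, f"{overlap(d):.0%}")
```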
To allow intuitive interpretation of pooled effects, we back transformed effect sizes to differences on a 10 cm visual analogue scale on the basis of a median pooled SD of 2.5 cm found in large scale osteoarthritis trials that assessed pain on a 10 cm visual analogue scale. We prespecified a minimal clinically important difference of 0.37 SD units, corresponding to 0.9 cm on a 10 cm visual analogue scale. This was based on the median minimal clinically important difference found in recent studies in patients with osteoarthritis. As the analysis of changes of minimum radiographic joint space did not include multiple time points, the model used for this outcome included only a random effect at the level of trials. To achieve comparability of the magnitude of effects on joint space and on pain and distinguish between small, moderate, and large treatment effects, we expressed differences in the width of the joint space as effect sizes, dividing the pooled estimates in millimetres by the median pooled SD of 1.2 mm found in included trials.
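The back transformations described above are simple rescalings. A minimal sketch, with hypothetical function names and a hypothetical joint space difference of −0.24 mm used only for illustration:

```python
PAIN_SD_CM = 2.5         # median pooled SD on a 10 cm visual analogue scale
JOINT_SPACE_SD_MM = 1.2  # median pooled SD of joint space width in included trials


def effect_size_to_vas_cm(effect_size: float) -> float:
    """Convert a pain effect size (SD units) to cm on a 10 cm visual analogue scale."""
    return effect_size * PAIN_SD_CM


def joint_space_mm_to_effect_size(difference_mm: float) -> float:
    """Convert a joint space width difference in mm to SD units."""
    return difference_mm / JOINT_SPACE_SD_MM


# Minimal clinically important difference of 0.37 SD units, expressed in cm.
mcid_cm = effect_size_to_vas_cm(-0.37)
print(round(mcid_cm, 1))
# Hypothetical pooled joint space difference, expressed as an effect size.
print(round(joint_space_mm_to_effect_size(-0.24), 2))
```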
Whenever possible, we used results of intention to treat analysis including all randomised patients. Pooled effect sizes were estimated from the median of the posterior distribution. A negative effect size indicates a benefit of the experimental intervention. Corresponding 95% credible intervals were estimated from the 2.5th and 97.5th centiles of the posterior distribution. In the presence of minimally informative priors, credible intervals can be interpreted in a similar way to conventional confidence intervals. To determine whether the variation of treatment effects over time was over and above what would be expected by chance, we calculated a P value for heterogeneity across time points of follow-up. The P value was derived from the proportion of observations of the posterior distribution of the variance observed across time points within trials smaller than or equal to the variance within trials typically found in large osteoarthritis trials (0.01 for an effect size scale, 0.0625 for a 10 cm visual analogue scale).
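The posterior summaries described above can be sketched from a list of Markov chain draws. This is an illustration with hypothetical function names. The centile interpolation rule is an assumption and may differ slightly from the one used by the authors' software.

```python
from statistics import median


def centile(draws, q: float) -> float:
    """Centile q (0 to 100) by linear interpolation between order statistics."""
    ordered = sorted(draws)
    position = q / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def summarise_posterior(draws):
    """Posterior median with a 95% credible interval."""
    return median(draws), centile(draws, 2.5), centile(draws, 97.5)


def heterogeneity_p(variance_draws, reference_variance: float = 0.01) -> float:
    """Share of posterior variance draws at or below the reference variance.

    The default of 0.01 is the reference on the effect size scale quoted in
    the text; 0.0625 applies on the 10 cm visual analogue scale.
    """
    at_or_below = sum(1 for v in variance_draws if v <= reference_variance)
    return at_or_below / len(variance_draws)


# Hypothetical draws, for illustration only.
estimate, lower_limit, upper_limit = summarise_posterior(list(range(101)))
print(estimate, lower_limit, upper_limit)
print(heterogeneity_p([0.005, 0.01, 0.02, 0.03]))
```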
To explore possible time trends, we included a linear term for time as a covariate in the analyses. We then included characteristics of the trials as covariates in the network meta-analysis to estimate effects according to concealment of allocation; intention to treat analysis; high methodological quality defined as adequate concealment of allocation, adequate blinding of patients, and the presence of an intention to treat analysis; source of funding (industry independent v other); type of glucosamine used (sulphate v hydrochloride); quality control of preparations; and type of joint affected (knee v hip). P values for interaction between trial characteristics and treatment effect were derived from the posterior distribution of covariates and can be interpreted in the same way as a traditional P value for interaction.
Heterogeneity between trials was estimated from the median variance between trials (τ²) observed in the posterior distribution with the following prior distributions: a gamma distribution for heterogeneity between trials (1/τ² ∼ gamma(0.001,0.001)I(0,2000)), and a uniform distribution for heterogeneity between time points (τ ∼ unif(0,50)). In a sensitivity analysis we also used a uniform prior for the heterogeneity between trials. The consistency of the network was determined by use of inconsistency factors: the estimated difference between the effect size from direct comparisons within randomised trials and the effect size from indirect comparisons between randomised trials with one intervention in common. Estimates of variation and consistency are based on back transformations to differences on a 10 cm visual analogue scale. Goodness of fit was assessed with Q-Q plots.
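The inconsistency factor defined above can be illustrated for a single loop of three interventions. This is a simplified sketch on point estimates only: the function names and the numerical values are hypothetical, and the sign convention (direct minus indirect) is an assumption. In the Bayesian model the factor would be estimated with its own posterior distribution.

```python
def indirect_effect(a_vs_common: float, b_vs_common: float) -> float:
    """Indirect estimate of A v B through a shared comparator (for example, placebo)."""
    return a_vs_common - b_vs_common


def inconsistency_factor(direct_a_vs_b: float, a_vs_common: float, b_vs_common: float) -> float:
    """Difference between the direct and the indirect estimate of A v B."""
    return direct_a_vs_b - indirect_effect(a_vs_common, b_vs_common)


# Hypothetical differences in cm on a 10 cm visual analogue scale.
factor = inconsistency_factor(direct_a_vs_b=-0.10, a_vs_common=-0.30, b_vs_common=-0.25)
print(round(factor, 2))
```

A factor close to zero suggests that direct and indirect evidence agree for that comparison.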
Finally, we performed pairwise meta-analyses with random effects at the level of trials and time points, as well as a simpler network meta-analysis including only one treatment effect per trial (absolute pain intensity at the longest follow-up available). Convergence of Markov chains was deemed to be achieved if plots of the Gelman-Rubin statistics indicated that widths of pooled runs and individual runs stabilised around the same value and their ratio around one [32]. Accordingly, all analyses are based on 150 000 iterations, of which the first 50 000 were discarded as burn-in period. We used Stata (Stata Statistical Software: release 10; StataCorp LP 2005, College Station, TX) and WinBUGS (version 1.4; MRC Biostatistics Unit 2007, Cambridge, UK) for all analyses.
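The convergence check described above compares the interval width of the pooled chains with the average width within individual chains after discarding the burn-in. The sketch below illustrates the idea and is not the WinBUGS implementation: the function names are hypothetical, the 80% central interval is an assumption, and the simulated chains are far shorter than the 150 000 iterations used in the analyses.

```python
import random


def centile(draws, q: float) -> float:
    """Centile q (0 to 100) by linear interpolation between order statistics."""
    ordered = sorted(draws)
    position = q / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def interval_width(draws, coverage: float = 0.8) -> float:
    """Width of the central interval with the given coverage (80% assumed)."""
    tail = 100 * (1 - coverage) / 2
    return centile(draws, 100 - tail) - centile(draws, tail)


def width_ratio(chains, burn_in: int) -> float:
    """Ratio of the pooled interval width to the mean within chain width."""
    kept = [chain[burn_in:] for chain in chains]  # discard the burn-in period
    pooled = interval_width([x for chain in kept for x in chain])
    within = sum(interval_width(chain) for chain in kept) / len(kept)
    return pooled / within


# Simulated chains for illustration: two that agree, and one shifted chain.
rng = random.Random(0)
chain_a = [rng.gauss(0, 1) for _ in range(3000)]
chain_b = [rng.gauss(0, 1) for _ in range(3000)]
chain_shifted = [rng.gauss(3, 1) for _ in range(3000)]

converged = width_ratio([chain_a, chain_b], burn_in=1000)
not_converged = width_ratio([chain_a, chain_shifted], burn_in=1000)
print(f"{converged:.2f} {not_converged:.2f}")
```

Chains sampling the same distribution give a ratio near one, whereas chains that have settled in different regions give a ratio well above one.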
Out of 58 potentially eligible reports, 12 reports describing 10 trials met our inclusion criteria and were included in the network meta-analysis.