The Tobacco and Alcohol Research Group blog


A Summary of the E-cigarettes Summit 2016

by Jasmine Khouja @jasmine_khouja

On the 17th November I attended the E-cigarette Summit 2016 at the Royal Society in London. The summit brought together researchers, policy-makers, smoking cessation services and industry members to hear about the latest research, developments and challenges in the e-cigarette domain.

The summit was a one-day event packed full of information, with 20 fast-paced (10-20 minute) talks and 4 panel discussions. My five take-home points from the summit were:

  1. Communication

One point which was raised on multiple occasions was that good communication of the research into e-cigarettes is key to the public understanding the risks and benefits of e-cigarette use. Unfortunately, the consensus was that the communication of e-cigarette research to the public is poor. Astonishingly, one speaker commented that someone had asked their daughter: “Is your dad still selling e-cigarettes and killing people?” This demonstrates how badly e-cigarettes have been portrayed, despite general consensus that they are much less harmful than cigarettes. Researchers are trying to communicate their research but face hurdles; some journals may be less likely to publish articles that are positive about vaping, meaning that it is harder to publish evidence that vaping is not as bad for you as cigarettes. The media are also hampering researchers’ efforts as they prefer stories which are anti-vaping and sometimes draw inaccurate conclusions from the evidence, which makes for more interesting stories. However, effective communication of the research is possible: Professor Peter Hajek and Dr Alex Freeman provided some useful advice to researchers which included not inferring human risks from animal studies, ensuring risks are directly compared to those of smoking, being a trustworthy source by being competent, honest and reliable, and providing neutral information without recommendations allowing the public to make their own informed decisions.

  2. The British Medical Association’s Guidelines

Communication of the benefits and risks of e-cigarettes isn’t limited to publications and the media; doctors are being asked about e-cigarettes by patients. Despite the evidence that the research community has provided that e-cigarettes are less harmful than cigarettes, the British Medical Association are yet to update their guidelines to encourage smokers to switch to e-cigarettes. There seemed to be apprehension stemming from the lack of known long-term effects, despite the fact that we know there are vastly fewer toxicants, in reduced amounts, in e-cigarettes compared to cigarettes, meaning that long-term effects as bad as or worse than those of smoking are extremely unlikely.

  3. Recent Research

Many new studies were presented but the study that really caught my attention was discussed by Dr Lynne Dawkins. Lynne provided evidence for increased puffing behaviour when participants are given lower doses of nicotine in their e-cigarettes [1]. She concluded that inhaling more vapour to receive the same amount of nicotine exposes vapers to unnecessary amounts of toxicants. This is very topical as the regulations set out by the Tobacco Products Directive (TPD), which will be fully implemented by May 2017, limit doses to 20 mg/mL, meaning that some higher dosage (36 mg/mL) users may expose themselves to extra toxicants to receive the levels of nicotine they need when the higher dosage products become unavailable in the next six months.

  4. The Tobacco Products Directive

The TPD provides some form of regulation for e-cigarette manufacturers and distributors. The inclusion of e-cigarettes in the TPD was controversial due to e-cigarettes not containing tobacco and the restrictive nature of the regulations, which were seen as unnecessary by some users and industry members. Part of the regulations included the thorough testing of e-cigarette products to ensure they were safe and the publication of the contents (including toxicants) so that the public could make informed decisions. To my dismay, I was informed that the information submitted by the e-cigarette companies so far will not be made publicly accessible for roughly six months due to a system error. I was also informed that compliance with the regulations was low and that age-of-sale restrictions in particular did not seem to be being enforced. The system and enforcement of the TPD in relation to e-cigarettes need improving so that consumers can access the information which the TPD states they should have access to, and to protect young people whose brain development may be adversely affected by consuming nicotine.

  5. New Systems

As restrictive as the TPD is, new products are still being developed. A new type of e-cigarette, known as pods, is emerging onto the market. These devices are small and similar in size to older, less effective designs of e-cigarettes (cigalikes) but have the power and nicotine delivery of the newer, more effective tank systems. The sleek, compact designs, combined with the improved nicotine delivery systems which prevent overheating (which is associated with harmful byproducts such as formaldehyde), are likely to be very popular. These systems can also record information on how the devices are used (how long individuals puff for, how many puffs they take, etc.), which could provide essential information to researchers on how e-cigarettes are used in real-life situations.

The day culminated in a keynote speech by the Attorney General for Iowa, Tom Miller. He commended the UK’s focus on e-cigarette research and the generally positive stance our public health officials have taken in terms of e-cigarettes. He concluded his speech by asking for help from the UK to bring the US up to the same standards.


  1. PMID: 27650300

Psychiatric disorders: what’s the significance of non-random mating?

Hardly a week passes without the publication of a study reporting the identification of genetic variants associated with an increasing number of behavioural and psychiatric outcomes. This is partly driven by the growth in large international consortia of studies, as well as the release of data from very large studies such as UK Biobank. These consortia and large individual studies are now achieving the necessary sample sizes to detect the very small effects associated with common genetic variants.

We’ve known for some time that psychiatric disorders are under a degree of genetic influence, but one puzzle is why estimates of the heritability of these disorders (i.e., the proportion of variability in risk of a disorder that is due to genetic variation) differ across disorders. Another intriguing question is why there appears to be a high degree of genetic comorbidity across different disorders; that is, common genetic influences that relate to more than one disorder. One possible answer to both questions may lie in the degree of non-random mating by disorder.

Non-random mating refers to the tendency for partners to be more similar than we would expect by chance on any given trait of interest. This is straightforward to see for traits such as height and weight, but less obvious for traits such as personality. A recent study by Nordsletten and colleagues investigated the degree of non-random mating for psychiatric disorders, as well as a selection of non-psychiatric disorders for comparison purposes.



The researchers used data from three Swedish national registers, using unique personal identification numbers assigned at birth. The data were linked to the Swedish National Patient Register (NPR), which includes diagnostic information on all individuals admitted to a Swedish hospital and, since 2001, on outpatient consultations. Individuals with multiple diagnoses could appear as a “case” in each separate analysis of these different diagnoses.

Cases of schizophrenia, bipolar disorder, autism spectrum disorder, anorexia nervosa, substance abuse, attention deficit hyperactivity disorder (ADHD), obsessive compulsive disorder (OCD), major depressive disorder, social phobia, agoraphobia, and generalised anxiety disorder were identified using standard protocols. For comparison purposes, cases of Crohn’s disease, type 1 and type 2 diabetes, multiple sclerosis and rheumatoid arthritis were also identified.

For each case (i.e., individuals with a diagnosis), five population controls were identified, matched on age, sex and area of residence. Mating relationships were identified through records of individual marriages, and through records of individuals being the biological parent of a child. The use of birth of a child was intended to capture couples who remained unmarried. For each member of a mated case pair a comparison sample was again generated, with the constraint that these controls not have the diagnosis of interest.

First, the proportion of mated pairs in the full case and control samples was summarised. Correlations were calculated to evaluate the relationship between the diagnostic status of each individual in a couple, first within and then across disorders. Logistic regression was used to estimate the odds of any diagnoses in mates of cases relative to mates of controls. Finally, the odds of any diagnosis in mates was estimated, and the relationship between the number of different disorders in a case and the presence of any psychiatric diagnoses in their mate explored.

Non-random mating is not a lack of promiscuity! It’s the tendency for partners to be more similar than we would expect by chance on any given trait of interest.


Cases showed reduced odds of mating relative to controls, and this differed by diagnosis, with the greatest attenuation among individuals with schizophrenia. In the case of some diagnoses (e.g., ADHD) this low rate of mating may simply reflect, at least in part, the young age of these populations.

Within each diagnostic category, there was evidence of a correlation in diagnostic status for mates of both sexes (ranging from 0.11 to 0.48), and there was also evidence of cross-disorder correlations, although these were typically smaller than within-disorder correlations (ranging from 0.01 to 0.42).

In general, if an individual had a diagnosis this was typically associated with a 2- to 3-fold increase in the odds of his or her mate having the same or a different disorder. This was particularly pronounced for certain conditions, such as ADHD, autism spectrum disorder and schizophrenia.
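To make the "2- to 3-fold increase in the odds" concrete, here is a minimal sketch of an unadjusted odds ratio from a 2x2 table. The counts are hypothetical and chosen only for illustration; the study itself estimated odds with logistic regression, which this simple calculation does not reproduce.

```python
def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no):
    """Unadjusted odds ratio from the four cells of a 2x2 table."""
    odds_exposed = exposed_yes / exposed_no
    odds_unexposed = unexposed_yes / unexposed_no
    return odds_exposed / odds_unexposed


# Hypothetical counts (NOT taken from the study):
# mates of cases:    30 with a diagnosis, 970 without
# mates of controls: 60 with a diagnosis, 4940 without
or_mates = odds_ratio(30, 970, 60, 4940)
print(f"Odds ratio: {or_mates:.2f}")  # roughly 2.5, i.e. within the 2- to 3-fold range
```

An odds ratio of 1 would mean mates of cases are no more likely to have a diagnosis than mates of controls.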

In contrast to psychiatric samples, mating rates were consistently high among both men and women with non-psychiatric diagnoses, and correlations both across and within the conditions were rare (ranging from -0.03 to 0.17), with the presence of a non-psychiatric diagnosis associated with little increase in his or her spouse’s risk.

This general population study found an amazing amount of assortative (non-random) mating within psychiatric disorders.


These results indicate a striking degree of non-random mating for psychiatric disorders, compared with minimal levels for non-psychiatric disorders.

Correlations between partners were:

  • Greater than 0.40 for ADHD, autism spectrum disorder and schizophrenia,
  • Followed by substance abuse (range 0.36 to 0.39),
  • And detectable but more modest for other disorders, such as affective disorders (range 0.14 to 0.19).

The authors conclude the following:

  • Non-random mating is common in people with a psychiatric diagnosis.
  • Non-random mating occurs both within and across psychiatric diagnoses.
  • There is substantial variation in patterns of non-random mating across diagnoses.
  • Non-random mating is not present to the same degree for non-psychiatric diagnoses.


So, what are the implications of these findings?

First, non-random mating could account for the relatively high heritability of psychiatric disorders, and also explain why some psychiatric disorders are more heritable than others (if the degree of assortment varies by disorder).

This is because non-random mating will serve to increase additive genetic variation across generations until equilibrium is reached, leading to increased (narrow sense) heritability for any trait on which it is acting.
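The arithmetic behind this can be sketched as follows, assuming the standard definition of narrow-sense heritability as additive genetic variance divided by total phenotypic variance. The variance components are hypothetical, and the sketch does not model how assortative mating produces the increase; it only shows that a larger additive component, with everything else held fixed, gives a higher heritability.

```python
def narrow_sense_heritability(additive_var, other_var):
    """h^2 = V_A / V_P, where V_P = V_A + all remaining variance."""
    return additive_var / (additive_var + other_var)


# Hypothetical variance components (illustrative only, not estimates from any study)
h2_random = narrow_sense_heritability(additive_var=0.40, other_var=0.60)
h2_assortative = narrow_sense_heritability(additive_var=0.50, other_var=0.60)

print(f"Random mating:      h2 = {h2_random:.3f}")
print(f"Assortative mating: h2 = {h2_assortative:.3f}")
```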

Second, non-random mating across psychiatric disorders (reflected, for example in a correlation of 0.31 between schizophrenia and autism spectrum disorder) could help to explain in part the observed genetic comorbidity across these disorders.

Non-random mating could explain why some psychiatric disorders are more heritable than others.

Strengths and limitations

This is an extremely well-conducted, authoritative study using a very large and representative data set. The use of a comparison group of non-psychiatric diagnoses is also an important strength, which gives us insight into just how strong non-random mating with respect to psychiatric diagnoses is.

The major limitations include:

  • Not being able to capture other pairings (e.g., unmarried childless couples)
  • A reliance on register diagnoses, which largely excludes outpatients etc
  • A lack of insight into possible mechanisms

This last point is interesting; non-random mating such as that observed in this study could arise because couples become more similar over time after they have become a couple (e.g., due to their interactions with each other) or may be more similar from the outset (e.g., because similar individuals are more likely to form couples in the first place, known as assortative mating).

The authors conclude that the non-random mating they observed may be due to assortative mating for two reasons. First, shared environment (which would capture effects of partner interactions) appears to play very little role in many psychiatric conditions. Second, neurodevelopmental conditions are present over the lifespan (i.e., before couples typically meet), which would suggest an assortative mating explanation for the observed similarity for at least these conditions.

Some disorders (e.g., schizophrenia) are associated with reduced reproductive success, and therefore should be under strong negative selection in the general population. However, these results suggest they may be positively selected for within certain psychiatric populations. In other words, these mating patterns could, in part, compensate for the reduced reproductive success associated with certain diagnoses, and explain why they persist across generations.

Implications for future research

Non-random mating also has implications for research, and in particular the use of genetic models. These models typically assume that mating takes place at random, but the presence of non-random mating (as indicated by this study) suggests that this should be taken into account in these models. This could be done by allowing for a correlation between partners, and neglecting this correlation may lead to an underestimate of heritability.


This study suggests that non-random mating is widespread for psychiatric conditions, which may help to provide insights into why these conditions are transmitted across generations, and why there is such a strong degree of comorbidity across psychiatric diagnoses. The results also challenge a fundamental assumption of many genetic approaches.

Assortative mating means that, in general population terms, people in romantic relationships with those who have psychiatric disorders are also likely to have psychiatric problems themselves.


Primary paper

Nordsletten AE, Larsson H, Crowley JJ, Almqvist C, Lichtenstein P, Mataix-Cols D. (2016) Patterns of nonrandom mating within and across 11 major psychiatric disorders. JAMA Psychiatry 2016. doi: 10.1001/jamapsychiatry.2015.3192

Can a machine learning approach help us predict what specific treatments work best for individuals with depression?

by Marcus Munafò @MarcusMunafo

This blog originally appeared on the Mental Elf site on 11th February 2016.


Understanding who responds well to treatment for depression is important both scientifically (to help develop better treatments) and clinically (to more efficiently prescribe effective treatments to individuals). Many attempts to predict treatment outcomes have focused on mechanistic pathways (e.g., genetic and brain imaging measures). However, these may not be particularly useful clinically, where such measures are typically not available to clinicians making treatment decisions. A better alternative might be to use routinely- or readily-collected behavioural and self-report data, such as demographic variables and symptom scores.

Chekroud and colleagues (2015) report the results of a machine learning approach to predicting treatment outcome in depression, using clinical (rather than mechanistic) predictors. Since there are potentially a very large number of predictors, examining all possible predictors in an unbiased manner (sometimes called “data mining”) is most likely to produce a powerful prediction algorithm.

Machine learning approaches are well suited to this task, because they can identify patterns of information in data, rather than focusing on individual predictors. They can therefore identify the combination of variables that most strongly predicts the outcome. However, prediction algorithms generated in this way need to be independently validated. By definition, they will predict the outcome in the data set used to generate the algorithm (the discovery sample). The real test is whether they also predict similar outcomes in independent data sets (the replication sample). This avoids circularity, and increases the likelihood the algorithm will be clinically useful.

Clinicians currently have no empirically validated mechanisms to assess whether a patient with depression will respond to a specific antidepressant.


The authors used data from a large, multicenter clinical trial of major depressive disorder (the STAR*D trial – Trivedi et al, 2006) as their discovery sample, and a separate clinical trial (the CO-MED trial, Rush et al, 2011) as their replication sample. Data were available on 1,949 participants in the STAR*D trial, and 425 participants in the CO-MED trial. The CO-MED trial consisted of three treatment groups, with participants randomised to receive either:

  1. Escitalopram-placebo
  2. Bupropion-escitalopram
  3. Venlafaxine-mirtazapine

The authors built a predictive model using all readily-available sources of information that overlapped for participants in both trials. This included:

  • A range of sociodemographic measures
  • DSM-IV diagnostic items
  • Symptom severity checklists
  • Eating disorder diagnoses
  • Whether the participants had taken specific antidepressant drugs
  • History of major depression
  • The first 100 items of the psychiatric diagnostic symptoms questionnaire.

In total, 164 variables were used.

For the training process, the machine learning approach divided the original sample (using the STAR*D data) into ten subsets, using nine of those in the training process to make predictions about the remaining subset. This process was repeated ten times, and the results averaged across these repeats. The final model built using the STAR*D data was then used to predict outcomes in each of the CO-MED trial treatment groups separately.
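The ten-subset procedure described above is usually called 10-fold cross-validation. A minimal sketch in plain Python follows; the function names, the toy data and the "majority outcome" model are all hypothetical stand-ins, and nothing here reflects the authors' actual model or software.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]


def cross_validate(fit, score, data, k=10):
    """Train on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = fit([data[i] for i in train_idx])
        scores.append(score(model, [data[i] for i in test_idx]))
    return sum(scores) / len(scores)


# Toy data: (feature, outcome) pairs. The "model" simply predicts the majority outcome.
toy = [(i, i % 3 == 0) for i in range(100)]


def fit_majority(train):
    positives = sum(1 for _, outcome in train if outcome)
    return positives * 2 > len(train)


def accuracy(model, test):
    return sum(1 for _, outcome in test if outcome == model) / len(test)


mean_accuracy = cross_validate(fit_majority, accuracy, toy, k=10)
print(f"Mean held-out accuracy: {mean_accuracy:.2f}")
```

Because every participant is held out exactly once, the averaged score is less optimistic than one computed on the data used for training, although it is still no substitute for a separate replication sample.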

The model was developed to detect people for whom citalopram (given to everyone in the first 12 weeks of the STAR*D trial) is beneficial, rather than predicting non-responders. It was constrained to require only 25 predictive features (i.e., clinical measures), to balance model performance (which should be greater with an increasing number of predictors) with clinical usability (since an algorithm requiring a very large number of predictors may be difficult to implement in practice).

Only 11-30% of patients with depression reach remission with initial treatment, even after 8-12 months.


The top three predictors of non-remission were:

  1. Baseline depression severity
  2. Feeling restless during the past 7 days
  3. Reduced energy level during the past 7 days

The top three predictors of remission were:

  1. Currently being employed
  2. Total years of education
  3. Loss of insight into one’s depressive condition

Overall, the model predicted outcome in the STAR*D data with:

  • An accuracy of 64.6% – it identified 62.8% of participants who eventually reached remission (i.e., sensitivity), and 66.2% of non-remitters (i.e., specificity)
  • This is equivalent to a positive predictive value (PPV) of 64.0% and a negative predictive value (NPV) of 65.3%
  • The performance of the model was considerably better than chance (P = 9.8 × 10⁻³³)
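For readers less familiar with these terms, the sketch below shows how accuracy, sensitivity, specificity, PPV and NPV are all derived from the same four counts. The counts are hypothetical and are not the STAR*D figures, which the post reports only as percentages.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard summary measures from a 2x2 table of predicted vs. actual outcome."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # remitters correctly identified
        "specificity": tn / (tn + fp),  # non-remitters correctly identified
        "ppv": tp / (tp + fp),          # predicted remitters who did remit
        "npv": tn / (tn + fn),          # predicted non-remitters who did not remit
    }


# Hypothetical counts (illustrative only)
metrics = classification_metrics(tp=60, fp=30, tn=70, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity describe the model itself, whereas PPV and NPV also depend on how common remission is in the sample.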

In the CO-MED data, the model:

  • Predicted outcome in the escitalopram-placebo group:
    • Accuracy 59.6%, 95% CI 51.3% to 67.5%
    • P = 0.043
    • PPV 65.0%
    • NPV 56.0%
  • Predicted outcome in the escitalopram-bupropion group:
    • Accuracy 59.7%, 95% CI 50.9% to 68.1%
    • P = 0.023
    • PPV 59.7%
    • NPV 59.7%

However, there was no statistical evidence that it performed better than chance in the venlafaxine-mirtazapine group:

  • Accuracy 51.4%, 95% CI 42.8% to 60.0%,
  • P = 0.53,
  • PPV 53.9%,
  • NPV 50.0%.

Could predictive models that mine existing trial data help us prospectively identify people with depression who are likely to respond to a specific antidepressant?


The authors conclude that their model performs comparably to the best biomarker currently available (an EEG-based index) but is less expensive and easier to implement.

The outcome (clinical remission, based on a final score of 5 or less on the 16-item self-report Quick Inventory of Depressive Symptomatology, after at least 12 weeks) is associated with better function and better prognosis than response without remission.

Strengths and limitations

There are some strengths to this study:

  1. First, it attempts to build a prediction algorithm using data that are already collected routinely in clinical practice, or could be easily incorporated into routine practice.
  2. Second, the prediction algorithm shows some evidence of generalisability to an independent sample.
  3. Third, the algorithm also shows some degree of specificity, by performing best in the escitalopram-treated groups in the CO-MED data.

However, there are also some limitations:

  1. First, there is a clear reduction in how well the algorithm predicts treatment outcome in the replication sample (CO-MED) compared with the discovery sample (STAR*D). This illustrates the need for an independent replication sample in studies of this kind.
  2. Second, and more importantly, although the algorithm performed better in the escitalopram-treated groups in CO-MED, it’s not clear that there was any evidence that performance was different across the three arms – the 95% confidence intervals for the venlafaxine-mirtazapine group (42.8% to 60.0%) include the point estimates for the other two groups (escitalopram-placebo: 59.6%, escitalopram-bupropion: 59.7%). Therefore, although there is some evidence of specificity, it is indirect, and the algorithm may in fact predict treatment outcome in general, rather than in those who have received a specific treatment, at least in part.
  3. Third, models of this kind cannot tell us whether the variables that predict treatment outcome are causal. This may not matter if our focus is on clinical prediction, although if they are not causal then the prediction algorithm may not generalize well to other populations. For example, in both the discovery and replication sample participants had been recruited into clinical trials, and therefore may not be representative of the wider population of people with major depressive disorder. Causal anchors are likely to be more important if we are interested in mechanistic (rather than clinical) predictors.
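The comparison made in the second limitation can be checked directly using the figures reported above. This is only an informal look at whether the other arms' point estimates fall inside the venlafaxine-mirtazapine confidence interval; it is not a formal test of a difference between arms, and the variable names are my own.

```python
def within_interval(estimate, lower, upper):
    """True if a point estimate lies inside a confidence interval."""
    return lower <= estimate <= upper


# Accuracy figures (%) as reported for the CO-MED arms
venlafaxine_mirtazapine_ci = (42.8, 60.0)
other_arms = {
    "escitalopram-placebo": 59.6,
    "escitalopram-bupropion": 59.7,
}

overlap = {
    arm: within_interval(acc, *venlafaxine_mirtazapine_ci)
    for arm, acc in other_arms.items()
}
print(overlap)  # both values are True
```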


Ultimately, being able to simultaneously identify individuals likely to respond well to drug A and not respond to drug B will be clinically valuable, and is the goal of stratified medicine. This study represents only the first step towards being able to identify likely responders and non-responders for a single drug (in this case, citalopram); in particular, although there was some evidence for specificity in this study, it was relatively weak.

With larger datasets that include multiple treatment options (including non-pharmacological interventions), it may be possible to match people to the treatment option they are most likely to respond successfully to. The focus on routinely- or readily-collected data means that it gives an insight into what clinical prediction algorithms for treatment response in psychiatry may look like in the future.

This innovative study (and others like it) may open the door to predict more personalised medicine for people with depression.


Primary paper

Chekroud AM, Zotti RJ, Shezhad Z, Gueorguieva R, Johnson MK, Trivedi MH, Cannon TD, Krystal JH, Corlett PR. (2015) Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry 2015. doi: S2215-0366(15)00471-X

Other references

Trivedi MH, Rush AJ, Wisniewski SR, Nierenberg AA, Warden D, Ritz L, Norquist G, Howland RH, Lebowitz B, McGrath PJ, Shores-Wilson K, Biggs MM, Balasubramani GK, Fava M; STAR*D Study Team. (2006) Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry. 2006 Jan;163(1):28-40.

Rush AJ, Trivedi MH, Stewart JW, Nierenberg AA, Fava M, Kurian BT, Warden D, Morris DW, Luther JF, Husain MM, Cook IA, Shelton RC, Lesser IM, Kornstein SG, Wisniewski SR. (2011) Combining medications to enhance depression outcomes (CO-MED): acute and long-term outcomes of a single-blind randomized study. Am J Psychiatry. 2011 Jul;168(7):689-701. doi: 10.1176/appi.ajp.2011.10111645. Epub 2011 May 2.

Smoking and risk of schizophrenia: new study finds a dose-response relationship

by Marcus Munafò @MarcusMunafo

This blog originally appeared on the Mental Elf site on 1st July 2015.

Almost exactly a year ago, a landmark study identified 108 genetic loci associated with schizophrenia (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). In a Mental Elf post on that study I wrote: “Genetic studies also don’t rule out an important role for the environment – [genome-wide association studies] might even help identify other causes of disease, by identifying loci associated with, for example, tobacco use.”

I mentioned this because one of the loci identified is strongly associated with heaviness of smoking. There are two possible explanations for this: either this locus influences both smoking and schizophrenia, or smoking causes schizophrenia.

Smoking and schizophrenia are highly co-morbid; the prevalence of smoking among people with a diagnosis of schizophrenia is much higher than in the general population. It is widely believed that this is because smoking helps to alleviate some of the symptoms of schizophrenia, or the side-effects of antipsychotic medication.

The possibility that smoking itself may be a risk factor for schizophrenia has generally not been widely considered. Now, however, intriguing evidence has emerged that it may be, from a large study of data from Swedish birth and conscript registries (Kendler et al, 2015).

The leading causes of premature mortality in people with schizophrenia are ischaemic heart disease and cancer, both heavily related to smoking.


The authors linked nationwide Swedish registers via the unique 10-digit identification number assigned at birth or immigration to all Swedish residents. Data on smoking habits were collected from the Swedish Birth Register (for women) and the Military Conscription Register (for men). The date of onset of illness was defined as the first hospital discharge diagnosis for schizophrenia or non-affective psychosis.

Cox proportional hazard regressions were used to investigate the associations between smoking and time to schizophrenia diagnosis. To evaluate the possibility that smoking began during a prodromal period (where symptoms of schizophrenia may emerge prior to a full diagnosis), buffer periods of 1, 3 and 5 years were included in the models. In the female sample, data from relatives (siblings and cousins) were also used to control for familial confounding (genetic and environmental).


Smoking status information was available for 1,413,849 women, and 233,879 men.

There was an association between smoking at baseline and a subsequent diagnosis of schizophrenia for:

  • Women
    • Light smoking: hazard ratio 2.21 (95% CI 1.90 to 2.56)
    • Heavy smoking: hazard ratio 3.45 (95% CI 2.95 to 4.03)
  • Men
    • Light smoking: hazard ratio 2.15 (95% CI 1.25 to 3.44)
    • Heavy smoking: hazard ratio 3.80 (95% CI 1.19 to 6.60)

Adjustment for socioeconomic status and prior drug abuse (i.e., confounding) weakened these associations slightly.

Taking into account the possibility of smoking onset during a prodromal period also did not weaken these associations substantially, irrespective of whether the buffer period (from smoking assessment to the date at which a first schizophrenia diagnosis would be counted) was 1-, 3- or 5-years. Theoretically, if prodromal symptoms of schizophrenia lead to smoking onset (i.e., reverse causality) the smoking-schizophrenia association should weaken with longer buffer periods.
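The buffer-period logic can be sketched as below: a diagnosis only counts as an outcome if it occurs more than a given number of years after smoking was assessed. The records are invented, and both the strict "more than" boundary and the way earlier diagnoses are simply not counted are my assumptions; the post does not describe how the authors handled these details in their Cox models.

```python
def counts_as_event(assessment_year, diagnosis_year, buffer_years):
    """True if a diagnosis falls after the buffer period following assessment."""
    if diagnosis_year is None:  # never diagnosed
        return False
    return diagnosis_year - assessment_year > buffer_years


# Invented records (illustrative only)
records = [
    {"assessed": 1990, "diagnosed": 1992},
    {"assessed": 1990, "diagnosed": 1994},
    {"assessed": 1990, "diagnosed": 2000},
    {"assessed": 1990, "diagnosed": None},
]

events_by_buffer = {
    buffer: sum(
        counts_as_event(r["assessed"], r["diagnosed"], buffer) for r in records
    )
    for buffer in (1, 3, 5)
}
print(events_by_buffer)  # {1: 3, 3: 2, 5: 1}
```

Longer buffers discard more of the early diagnoses, which are the ones most likely to reflect prodromal symptoms that preceded smoking.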

Finally, the co-relative analyses compared the association between smoking and schizophrenia in the female sample, within pairs of relatives of increasing genetic relatedness who had been selected on the basis of discordance for smoking (i.e., one smoked and one did not). If the smoking-schizophrenia association arises from shared familial risk factors (genetic or environmental) the association should weaken with increasing familial relatedness. Instead, only modest decreases were observed.

As a validation check on the accuracy of their measure of smoking behaviour, the authors confirmed that heavy smoking was more strongly associated with both lung cancer and chronic obstructive pulmonary disease, two diseases known to be caused by smoking.

These results show a dose-response relationship between smoking and risk of schizophrenia, i.e. the more you smoke, the stronger the association. 


This study provides clear evidence of a prospective association between cigarette smoking and a subsequent diagnosis of schizophrenia. However, observational associations are notoriously problematic, because these associations may arise because of confounding (measured and unmeasured), or reverse causality. Since these analyses were conducted on observational data, these limitations should be borne in mind and we cannot say with certainty that smoking is a causal risk factor for schizophrenia.

Nevertheless, the authors conducted a number of analyses to attempt to rule out different possibilities. First, the associations were weakened only slightly when adjusted for socioeconomic status and prior drug abuse, so the impact of measured confounders appears to be modest (although other confounding could still be occurring). Second, the inclusion of a buffer period to account for smoking onset during a prodromal period also weakened the associations only slightly, which is not consistent with a reverse causality interpretation. Finally, the co-relative analysis did not indicate that the association differed strongly across levels of familial relatedness, suggesting that the impact of unmeasured familial confounders (both genetic and environmental) is relatively modest.

This study provides clear evidence of a prospective association between cigarette smoking and a subsequent diagnosis of schizophrenia.


There are some limitations to the study that are worth bearing in mind:

  1. First, there were no data on lifetime smoking, although the authors validated their measure of smoking against outcomes known to be caused by smoking.
  2. Second, the authors used clinical diagnoses, and included both schizophrenia and non-affective psychosis, so the specificity of the findings to these outcomes is uncertain.
  3. Third, because of the small number of schizophrenia diagnoses the co-relative analyses used non-affective psychosis only.

This study is not enough to say with certainty that smoking is a causal risk factor for schizophrenia.


There are three main ways in which the association between smoking and schizophrenia might arise:

  1. Schizophrenia causes smoking,
  2. Smoking causes schizophrenia, and
  3. The association arises from risk factors common to both.

This study suggests that the first mechanism cannot fully account for the association; if anything there was more support for the third mechanism, including stronger evidence for a role for familial factors than for socioeconomic status and drug abuse. However, critically, this study also finds support for the second mechanism, including a dose-response relationship between smoking and risk of schizophrenia.

Despite this study’s strengths, and the care taken by the authors to explore the three possible mechanisms that could account for the association between smoking and schizophrenia, no single study is definitive. However, evidence is emerging from other studies that support the possibility that smoking may be a causal risk factor for schizophrenia.

Recently, McGrath and colleagues have reported that earlier age of onset of smoking is prospectively associated with increased risk of non-affective psychosis (McGrath et al, 2015).

In addition, Wium-Andersen and colleagues report that tobacco smoking is causally associated with antipsychotic medication use (but not antidepressant use), in a Mendelian randomisation analysis that uses genetic variants as unconfounded proxies for heaviness of smoking (Wium-Andersen et al, 2015).

Identifying potentially modifiable causes of diseases such as schizophrenia is a crucial part of public health efforts. There is also often reluctance among health care professionals to encourage patients with mental health problems (including schizophrenia) to attempt to stop smoking. If smoking is shown to play a causal role in the development of schizophrenia, there may be more willingness to encourage cessation. Since the majority of the mortality associated with schizophrenia is due to tobacco use (Brown et al, 2000), helping people with schizophrenia to stop is vital to their long-term health.

There is now mounting evidence that supports the possibility that smoking may be a causal risk factor for schizophrenia.


Primary paper

Kendler, K.S., Lonn, S.L., Sundquist, J. & Sundquist, K. (2015). Smoking and schizophrenia in population cohorts of Swedish women and men: a prospective co-relative control study. American Journal of Psychiatry. doi: 10.1176/appi.ajp.2015.15010126

Other references

Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014). Biological insights from 108 schizophrenia-associated genetic loci. Nature, 511, 421-427. doi: 10.1038/nature13595

McGrath, J.J., Alati, R., Clavarino, A., Williams, G.M., Bor, W., Najman, J.M., Connell, M. & Scott, J.G. (2015). Age at first tobacco use and risk of subsequent psychosis-related outcomes: a birth cohort study. Australian and New Zealand Journal of Psychiatry.

Wium-Andersen, M.K., Orsted, D.D. & Nordestgaard, B.G. (2015). Tobacco smoking is causally associated with antipsychotic medication use and schizophrenia, but not with antidepressant medication use or depression. International Journal of Epidemiology, 44, 566-577.

Brown S, Inskip H, Barraclough B. (2000) Causes of the excess mortality of schizophrenia. Br J Psychiatry. 2000 Sep;177:212-7.


E-cigarettes and teenagers: cause for concern?

By Marcus Munafo @MarcusMunafo 

This blog originally appeared on the Mental Elf site on 20th April 2015


Electronic cigarettes (e-cigarettes) are a range of products that deliver vapour which typically contains nicotine (although zero-nicotine solutions are available). The name is misleading because some products are mechanical rather than electronic, and because they are not cigarettes. While first-generation products were designed to be visually similar to cigarettes, second- and third-generation products are visually distinctive and come in a variety of shapes and sizes. Critically, these products do not contain tobacco, and are therefore intended to deliver nicotine without the harmful constituents of tobacco smoke.

There has been rapid growth in the popularity and use of e-cigarettes in recent years, accompanied by growth in their marketing. At present they are relatively unregulated in many countries, although countries are introducing various restrictions on their availability and marketing. For example, a ban on sales to under-18s will be introduced in England and Wales in 2015.

These products have stimulated considerable (and often highly polarized) debate in the public health community. On the one hand, if they can support smokers in moving away from smoking they have enormous potential to reduce the harms associated with smoking. On the other hand, the quality and efficacy of these products remains largely unknown and is likely to be highly variable, and data on the long-term consequences of their use (e.g., the inhalation of propylene glycol vapour and flavourings) is lacking. There is also a concern that these products may re-normalise smoking, or act as a gateway into smoking.

E-cigarettes and teenagers: a gateway


This study reports the results of a survey conducted by Trading Standards in the North-West of England on 14 to 17 year-old students. The survey focuses on tobacco-related behaviours, and a question on access to e-cigarettes was introduced in 2013. This enabled identification of factors associated with e-cigarette use among people under 18 years old.

The study used data from the 5th Trading Standards North West Alcohol and Tobacco Survey among 14 to 17 year-olds in North-West England, conducted in 2013. The questionnaire was made available to secondary schools across the region through local authority Trading Standards departments, and delivered by teachers during normal school lessons. Compliance was not recorded, and the sample was not intended to be representative but to provide a sample from a range of communities.

The survey consisted of closed, self-completed questions covering sociodemographic variables, alcohol consumption and tobacco use. There were also questions on methods of access to alcohol and tobacco, as well as involvement in violence when drunk. E-cigarette access was assessed by the question “Have you ever tried or purchased e-cigarettes?”.

The study used data from the North West Alcohol and Tobacco Survey, which asked 14 to 17 year-olds lots of questions about their substance use behaviour.


A total of 114 schools participated, and the total dataset included 18,233 participants, some of whom were removed for missing data or spoiled questionnaires (e.g., unrealistic answers), so that the final sample for analysis was 16,193. Some of the main findings of the survey included:

  • In total, 19.2% of respondents reported having accessed e-cigarettes, with this being higher in males than females, and increasing with age and socioeconomic deprivation.
  • Level of e-cigarette access was higher among those who had smoked, ranging from 4.9% of never smokers, through 50.7% of ex-smokers, 67.2% of light smokers and 75.8% of heavy smokers.
  • E-cigarette use was associated with alcohol use, with those who drank alcohol more likely to have accessed e-cigarettes than non-drinkers, as well as with smoking by parents/guardians.



The authors conclude that their results raise concerns around the access to e-cigarettes by children, particularly among those who have never smoked cigarettes. They argue that their findings suggest that the children who access e-cigarettes are also those most vulnerable to other forms of substance use and risk-taking behavior, and conclude with a call for the “urgent need for controls on e-cigarette sales to children”. The study has some important strengths, most notably its relatively large size, and ability to determine which respondents were living in rich and poor areas.

Understanding the determinants of e-cigarette use, and patterns of use across different sections of society, is important to inform the ongoing debate around their potential benefits and harms. However, it is also not clear what this study tells us that was not already known. The results are consistent with previous, larger surveys, which show that young people (mostly smokers) are trying e-cigarettes. Critically, these previous surveys have shown that while some young non-smokers are experimenting with electronic cigarettes, progression to regular use among this group is rare. Product labels already indicate that electronic cigarettes are not for sale to under-18s, and in 2014 the UK government indicated that legislation will be brought forward to prohibit the sale of electronic cigarettes to under-18s in England and Wales (although at present no such commitment has been made in Scotland).

This study does not add anything significant to our knowledge about e-cigarettes.


There are a number of important limitations to this study:

  • As the authors acknowledge, this was not meant to be a representative survey, and the results can therefore not be generalized to the rest of the north-west of England, let alone the wider UK.
  • As a cross-sectional survey it was not able to follow up individual respondents, for example to determine whether never smokers using e-cigarettes progress to smoking. This problem is common to most e-cigarette surveys to date.
  • The question asked does not tell us whether the participants actually used the e-cigarette they accessed, or what liquid was purchased with the e-cigarette (e.g., the concentration of nicotine). Zero-nicotine solutions are available, and there is evidence that these solutions are widely used by young people.
  • The results are presented confusingly, with numerous percentages (and percentages of percentages) reported. For example, 4.9% of never smokers reported having accessed e-cigarettes, but this is less than 3% of the overall sample (fewer than 500 out of 16,193 respondents). This is potentially an important number to know, but is not reported directly in the article.
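The conversion between a within-group percentage and a share of the whole sample can be checked with a few lines of arithmetic. The sketch below uses only the three figures quoted above (16,193 respondents, 4.9% of never smokers, "less than 3%" overall). The number of never smokers is not given here, so `hypothetical_never_smokers` is an assumption made purely for illustration, and the function name is likewise invented.

```python
SAMPLE_SIZE = 16_193        # final analysis sample quoted above
NEVER_SMOKER_RATE = 0.049   # share of never smokers who had accessed e-cigarettes
OVERALL_CEILING = 0.03      # "less than 3% of the overall sample"


def overall_share(group_rate: float, group_size: int, sample_size: int) -> float:
    """Convert a rate within a subgroup into a share of the whole sample."""
    return group_rate * group_size / sample_size


# 3% of the sample, i.e. the ceiling on never smokers who accessed e-cigarettes.
ceiling_count = OVERALL_CEILING * SAMPLE_SIZE

# Largest never-smoker group size compatible with the two quoted percentages.
implied_max_never_smokers = ceiling_count / NEVER_SMOKER_RATE

# Hypothetical group size (an assumption, not a figure from the study).
hypothetical_never_smokers = 9_000
example_share = overall_share(
    NEVER_SMOKER_RATE, hypothetical_never_smokers, SAMPLE_SIZE
)

print(f"3% of the sample is about {ceiling_count:.0f} respondents")
print(f"Implied upper bound on never smokers: about {implied_max_never_smokers:.0f}")
print(f"Share of whole sample in the hypothetical case: {example_share:.1%}")
```

The point of the sketch is only that a headline percentage within a subgroup can correspond to a much smaller share of all respondents, which is why the absolute count matters.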


This study does not add much to what is already known. Young people experiment with substances like tobacco and alcohol, and as e-cigarettes have become widely available they have begun to experiment with these too. However, to describe electronic cigarette use as “a new drug use option” and part of “at-risk teenagers’ substance using repertoires” is probably unnecessarily alarmist, given that:

  1. There is evidence that regular use of e-cigarettes among never smokers is negligible
  2. There is little evidence of e-cigarette use acting as a gateway to tobacco use
  3. E-cigarette use is likely to be associated with very low levels of harm



Primary reference

Hughes K, Bellis MA, Hardcastle KA, McHale P, Bennett A, Ireland R, Pike K. Associations between e-cigarette access and smoking and drinking behaviours in teenagers. BMC Public Health 2015; 15: 244. doi: 10.1186/s12889-015-1618-4

Other references

Young Persons Alcohol and Tobacco Survey 2013. Lancashire County Council’s Trading Standards.

Is moderate alcohol consumption good for you?

By Marcus Munafo @MarcusMunafo 

This blog originally appeared on the Mental Elf site on 13th March 2015


This is something many of us would like to be true – the idea that the occasional glass of wine has health benefits is compelling in a society like the UK where alcohol consumption is widespread.

Certainly the observational data indicate a J-shaped association between alcohol consumption and mortality (O’Keefe et al, 2007), with the lowest mortality observed at low to moderate levels of alcohol consumption (equivalent to perhaps a pint of beer a day for men, and about half that for women).

However, observational studies like this are fraught with difficulties.

  1. First, people may not report their alcohol consumption reliably.
  2. Second, and more importantly, alcohol consumption is associated with a range of other lifestyle behaviours, such as diet and smoking, which will themselves influence mortality, so that isolating any specific association of alcohol is extremely difficult.
  3. Third, how non-drinkers are defined may be important – lifetime abstainers may be different from former drinkers (who could have stopped drinking because of health problems).

The last point illustrates the problem of reverse causality; alcohol consumption may be causally associated with a range of health outcomes, but some of those health outcomes may also be causally associated with alcohol consumption.

In a recent study in the BMJ, the authors argue that the problems associated with the choice of an appropriate referent group of non-drinkers are often overlooked in research into alcohol-related mortality.

They also argue that age is not adequately considered, which may be relevant because of physiological changes to the ageing body that influence elimination of blood alcohol. Knott and colleagues explored the association between alcohol consumption and all cause mortality for people aged less than 65 years and aged 65 or more, and separated never and former drinkers.

The lowest mortality observed is at low to moderate levels of alcohol consumption (equivalent to perhaps a pint of beer a day for men, and about half that for women).


The authors used data from the Health Survey for England, an annual, nationally-representative cross sectional survey of the general population, linked to national mortality registration data.

The analysis focused on adults aged 50 years or older, and investigated two measures of alcohol consumption: self-reported average weekly consumption over the past year, and self-reported consumption on the heaviest day in the past week. The outcome was all cause mortality (i.e., any death recorded during the period of data collection).

The primary statistical analyses were proportional hazards analyses for each of the two age groups of interest (less than 65 years and 65 years or more). They tested for whether any associations observed differed between males and females and, given strong evidence of a sex-dose interaction, reported sex-specific models for each age group of interest.

Statistical adjustment was made for a comprehensive list of potential confounders, such as geographical location, ethnicity, cigarette smoking, obesity and a range of socio-demographic variables.


Protective associations were only observed with statistical significance (a point I’ll return to below) among younger men (aged 50 to 64 years) and older women (65 years or older), using a never drinker referent category after full adjustment.

Among younger men a protective relationship between alcohol consumption and all cause mortality was observed among those who reported consuming 15.1 to 20 units per week (hazard ratio 0.49, 95% confidence interval 0.26 to 0.91).

Among older women, the range of protective use was broader but lower, with reductions in hazards of all cause mortality observed at all consumption levels up to 10 units per week or less.

The study supports a moderate protective effect of alcohol.


The authors conclude that observed associations between low levels of alcohol consumption and reduced all cause mortality may in part be due to inappropriate selection of a referent group (all non-drinkers, rather than never drinkers) and inadequate statistical adjustment for potential confounders.

They also conclude that beneficial dose response relationships between alcohol consumption and all cause mortality may be specific to women aged 65 years or older.

There is a relative lack of data on older populations in relation to the association between alcohol consumption and all cause mortality, which this study addresses. The consideration of different definitions of the referent category is also valuable – the authors are correct that conventional definitions of “non-drinker” may be problematic.

However, to what extent should we believe the conclusion that beneficial dose response relationships may be age- and sex-specific?

As David Spiegelhalter has pointed out, the authors base their conclusion on which associations achieved statistical significance and which did not. However, the hazard ratios for all cause mortality are consistently lower for alcohol consumers than non-consumers in this study. Although the confidence intervals are wider for some consumption levels and in some sub-groups (males vs females, or younger vs older), the individual hazard ratios are all consistent with each other.

The wide confidence intervals reflect a lack of statistical power, principally due to the small number of never drinkers, and the small number of deaths. Although the data set is relatively large, by carving it up into a number of sub-groups, the statistical power for the individual comparisons is reduced. Spiegelhalter points out that the entire comparison for participants in the younger age group is based on 17 deaths in the male baseline group and 19 deaths in the female group.

As Andrew Gelman and Hal Stern have said, the difference between “significant” and “non-significant” is not (necessarily) itself significant. Indeed, focusing on statistical significance (rather than effect size and precision) can lead to exactly the problems encountered here. Low statistical power is also a problem, reducing the likelihood that a statistically significant finding is true, and (perhaps more importantly) dramatically reducing the precision of our effect size estimates.
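Gelman and Stern's point can be illustrated numerically. The sketch below takes the one hazard ratio quoted above (0.49, 95% CI 0.26 to 0.91) and sets it against a second estimate that is entirely hypothetical (0.70, 95% CI 0.40 to 1.22), invented here to stand in for a "non-significant" subgroup. It assumes the two estimates are independent and that the intervals are symmetric on the log scale, a standard approximation rather than the study's own method.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr: float, lower: float, upper: float) -> tuple[float, float]:
    """Recover the log hazard ratio and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Estimate quoted in the post (younger men, 15.1 to 20 units per week).
reported_log_hr, reported_se = log_hr_and_se(0.49, 0.26, 0.91)

# Hypothetical second subgroup: an assumption, not a result from the study.
other_log_hr, other_se = log_hr_and_se(0.70, 0.40, 1.22)

p_reported = two_sided_p(reported_log_hr / reported_se)
p_other = two_sided_p(other_log_hr / other_se)

# Test of the difference between the two log hazard ratios,
# assuming the two estimates are independent.
difference = reported_log_hr - other_log_hr
se_difference = math.hypot(reported_se, other_se)
p_difference = two_sided_p(difference / se_difference)

print(f"reported estimate:     p = {p_reported:.3f}")
print(f"hypothetical estimate: p = {p_other:.3f}")
print(f"difference between them: p = {p_difference:.3f}")
```

In this example one estimate falls below the conventional 0.05 threshold and the other does not, yet the test of the difference between them is far from significant. Sorting subgroups into "protective" and "not protective" on the basis of their individual p-values would therefore overstate how different they are.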

Should we believe that beneficial dose response relationships are age- and sex-specific?

Strengths and limitations

There are some strengths to this study, notably the use of a more considered referent category of never drinkers, and the statistical adjustment for a broad range of potential confounders.

However, the primary conclusion of the authors does not seem to be borne out by their own data – hazard ratios for all cause mortality are lower for alcohol consumers than non-consumers at all levels of consumption, for both men and women, and for both the younger and older age groups.

Is moderate alcohol consumption good for us then? The observational data, including that from this study, continues to suggest so.

However, we should also remain wary of evidence from observational studies, which can be notoriously unreliable, and cannot confirm that an association is causal. Ultimately, we may need to use novel methods to answer this question, such as Mendelian randomization, which uses the properties of genetic variants to enable stronger causal inference.

We should be wary of evidence from observational studies, which can be notoriously unreliable, especially in underpowered studies like this one.


Knott CS, Coombs N, Stamatakis E, Biddulph JP. (2015) All cause mortality and the case for age specific alcohol consumption guidelines: pooled analyses of up to 10 population based cohorts. British Medical Journal, 350, h384. doi: 10.1136/bmj.h384

O’Keefe JH, Bybee KA, Lavie CJ. (2007) Alcohol and cardiovascular health: the razor-sharp double-edged sword. J Am Coll Cardiol. 2007;50(11)

Spiegelhalter D. (2015) Misleading conclusions from alcohol protection study. Understanding Uncertainty website, last accessed 11 Mar 2015.

The missing heritability problem

By Marcus Munafo

Missing heritability has been described as genetic “dark matter”

In my last post I described the transition from candidate gene studies to genome-wide association studies, and argued that the corresponding change in the methods used, focusing on the whole genome rather than on a handful of genes of presumed biological relevance, has transformed our understanding of the genetic basis of complex traits. In this post I discuss the reasons why, despite this success, we still have not accounted for all the genetic influences we expect to find.

As I discussed previously, genome-wide association studies (GWAS) have been extremely successful in identifying genetic variants associated with a range of disease outcomes – countless replicable associations have emerged over the last few years. Nevertheless, despite this success, the proportion of variability in specific traits accounted for so far is much less than what twin, family and adoption studies would lead us to expect. The individual variants identified are associated with a very small proportion of variance in the trait of interest (typically 0.1% or less), so that together they still only account for a modest proportion. Twin, family and adoption studies would lead us to expect that 50% or more of the variance in many complex traits is attributable to genetic influences, but so far we have found only a small fraction of that total. This has become known as the “missing heritability” problem. Where are the other genes? Should we be seeking common genetic variants of smaller and smaller effect, in larger and larger studies? Or is there a role for rare variants (i.e., those which occur with a low frequency in a particular population, typically a minor allele frequency less than 5%), which may have a larger effect?
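The size of the gap follows from simple arithmetic on the two figures above. The sketch below makes the simplifying assumption that each variant contributes independently and additively to trait variance; the variable and function names are illustrative only.

```python
import math
from fractions import Fraction

# Figures quoted above, held as exact fractions to avoid rounding surprises.
expected_heritability = Fraction(50, 100)     # "50% or more" of trait variance
max_variance_per_variant = Fraction(1, 1000)  # "0.1% or less" per variant

# Fewest variants needed if every one explained the full 0.1%
# (an additive, independent-effects simplification).
min_variants_needed = math.ceil(expected_heritability / max_variance_per_variant)


def variance_explained(n_variants: int, per_variant: Fraction) -> Fraction:
    """Total variance explained by n equally sized, independent variants."""
    return n_variants * per_variant


print(f"At least {min_variants_needed} variants would be needed")
print(f"Three such variants explain {float(variance_explained(3, max_variance_per_variant)):.1%}")
```

Because 0.1% is an upper bound on the typical effect, the true number of variants required would be larger than this minimum, which is consistent with the case for ever larger studies.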

It is clear that some missing heritability will be accounted for by variants that have not yet been identified via GWAS. Most GWAS genotyping chips don’t capture rare variants very well, but evolutionary theory predicts that those mutations that strongly influence complex phenotypes will tend to occur at low frequencies. Under the evolutionary neutral model, variants with these large effects are predicted to be rare. However, under the same model, while rare variants of large effect constitute the majority of causal variants, they still only contribute a small proportion of phenotypic variance in a population, because they are rare. On the other hand, common variants of small effect contribute a greater overall proportion of variance. There are new methods which use a less stringent threshold for including variants identified via GWAS – instead of only including those that reach “genomewide significance” (i.e., a P-value < 10^-8 – see my earlier post), those which reach a much more modest level of statistical evidence (e.g., P < 0.5) are included. This much more inclusive approach has shown that when considered together, common genetic variants do in fact seem to account for a substantial proportion of expected heritability.
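One common way in which an inclusive threshold is put to work is a polygenic score: a weighted sum of a person's allele counts across every variant that passes the chosen P-value cut-off. Whether this is exactly the method referred to above is an assumption, and the variants, effect sizes and genotypes in the sketch below are all hypothetical values invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    effect: float   # per-allele effect estimate from a GWAS (hypothetical)
    p_value: float  # strength of evidence for the association (hypothetical)


def polygenic_score(
    allele_counts: dict[str, int],
    variants: list[Variant],
    p_threshold: float,
) -> float:
    """Sum effect x allele count over variants with P below the threshold."""
    return sum(
        v.effect * allele_counts.get(v.name, 0)
        for v in variants
        if v.p_value < p_threshold
    )


# Hypothetical GWAS results: one strong signal and several weak ones.
gwas_results = [
    Variant("variant_a", effect=0.05, p_value=1e-9),
    Variant("variant_b", effect=0.02, p_value=0.003),
    Variant("variant_c", effect=-0.01, p_value=0.2),
    Variant("variant_d", effect=0.03, p_value=0.8),
]

# Hypothetical genotype: number of effect alleles (0, 1 or 2) at each variant.
person = {"variant_a": 1, "variant_b": 2, "variant_c": 1, "variant_d": 2}

strict_score = polygenic_score(person, gwas_results, p_threshold=1e-8)
inclusive_score = polygenic_score(person, gwas_results, p_threshold=0.5)

print(f"genomewide-significant variants only: {strict_score:.2f}")
print(f"all variants with P < 0.5:            {inclusive_score:.2f}")
```

With the strict threshold only one variant contributes, whereas the inclusive threshold lets the many weak signals add up. In real data such a score would be evaluated in an independent sample to see how much trait variance it predicts.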

In other words, complex traits, such as most disease outcomes but also those behavioural traits of interest to psychologists, are highly polygenic – that is, they are influenced by a very large number of common genetic variants of very small effect. This, in turn, explains why we have yet to reliably identify specific genetic variants associated with many psychological and behavioural traits – while the latest GWAS of traits such as height and weight (the GIANT Consortium) includes data on over 250,000 individuals, there exists no such collection of data on most psychological and behavioural traits. This situation is changing though – a recent GWAS of educational attainment combined data on over 125,000 individuals, and three genetic loci were identified with genomewide significance, although these were associated with very small effects (as we would expect). Excitingly, these findings have recently been replicated. Another large GWAS, this time of schizophrenia, identified 108 loci associated with the disease, putting this psychiatric condition on a par with traits such as height and weight in terms of our understanding of the underlying genetics.

The success of the GWAS method is remarkable – the recent schizophrenia GWAS, for example, has provided a number of intriguing new biological targets for further study. It should only be a matter of time (and sample size) before we begin to identify variants associated with personality, cognitive ability and so on. Once we do, we will understand more about the biological basis for these traits, and finally begin to account for the missing heritability.


Munafò, M.R., & Flint J. (2014). Schizophrenia: genesis of a complex disease. Nature, 511, 412-3.

Rietveld, C.A., et al. (2013). GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340, 1467-71.



This blog first appeared on The Inquisitive Mind site on 18th October 2014.

Cochrane review says there’s insufficient evidence to tell whether fluoxetine is better or worse than other treatments for depression

Depression is common in primary care and associated with a substantial personal, social and societal burden. There is considerable ongoing controversy regarding whether antidepressant pharmacotherapy works and, in particular, for whom. One widely-prescribed antidepressant is fluoxetine (Prozac), an antidepressant of the selective serotonin reuptake inhibitor (SSRI) class. Although a number of more recent antidepressants are available, fluoxetine (which went off patent in 2001) remains highly popular and is commonly prescribed.

This systematic review and meta-analysis, published through the Cochrane Collaboration, compares the effects of fluoxetine for depression with those of other SSRIs, tricyclic antidepressants (TCAs), serotonin-noradrenaline reuptake inhibitors (SNRIs), monoamine oxidase inhibitors (MAOIs) and newer agents, as well as other conventional and unconventional agents. This is an important clinical question – different antidepressants have different efficacy and side effect profiles, but direct comparisons are relatively rare.


Thank goodness for systematic reviewers who read hundreds of papers and combine the results, so you don't have to


The review focused on studies of adults with unipolar major depressive disorder (regardless of the specific diagnostic criteria used), searching major databases for studies published up to 11 May 2012.

All randomised controlled trials comparing fluoxetine with any other antidepressant (including non-conventional agents such as hypericum, also known as St John’s wort) were included. Both dichotomous (reduction of at least 50% on the Hamilton Depression Scale) and continuous (mean scores at the end of the trial or change score on depression measures) outcomes were considered.


A total of 171 studies were included in the analysis, conducted between 1984 and 2012 and comprising data on 24,868 participants.

A number of differences in efficacy and tolerability between fluoxetine and certain antidepressants were observed. However, these differences were typically small, so that the clinical meaning of these differences is not clear.

Moreover, the majority of studies failed to report detail on methodological procedures, and most were sponsored by pharmaceutical companies.

Both factors increase the risk of bias and overestimation of treatment effects.



The review found sertraline and venlafaxine (and possibly other antidepressants) had a better efficacy profile than fluoxetine

The authors conclude that: “No definitive implications can be drawn from the studies’ results”.

There was some evidence for greater efficacy of sertraline and venlafaxine over fluoxetine, which may be clinically meaningful, but other considerations such as side-effect profile, patient acceptability and cost will also have a bearing on treatment decisions.

In other words, despite considerable effort and pooling all of the available evidence, we still can’t be certain whether one antidepressant is superior to another.

What this review really highlights is the ongoing difficulty in establishing whether some drugs are genuinely effective (and safe), because of publication bias against null results (Turner et al, 2008).

This situation is made worse when there are financial vested interests involved. Recently, there has been active discussion about how this problem can be resolved, for example by requiring pharmaceutical companies to release all data from clinical trials they conduct, irrespective of the nature of the findings.

Despite the mountains of trials published in this field, we still cannot say for sure which treatments work best for depression


Clinical decisions regarding the most appropriate medication to prescribe are complex, and made harder by the lack of direct comparisons. Moreover, the apparent efficacy of individual treatments may be inflated by publication bias. Direct comparisons between different treatments are therefore important, but remain relatively rare. This Cochrane Review provides very important information, even if only by highlighting how much we still don’t know about which treatments work best.


Magni LR, Purgato M, Gastaldon C, Papola D, Furukawa TA, Cipriani A, Barbui C. Fluoxetine versus other types of pharmacotherapy for depression. Cochrane Database of Systematic Reviews 2013, Issue 7. Art. No.: CD004185. DOI: 10.1002/14651858.CD004185.pub3.

Etchells, P. We don’t know if antidepressants work, so stop bashing them. The Guardian website, 15 Aug 2013.

Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008 Jan 17;358(3):252-60. doi: 10.1056/NEJMsa065779.

This article first appeared on the Mental Elf website on 1st October 2013 and is posted by Marcus Munafo

“Doubt is our product…”

Cigarette smoking is addictive. Cigarette smoking causes lung cancer. Today these statements are uncontroversial, but it’s easy to forget that this was not the case until relatively recently. The first studies reporting a link between smoking and lung cancer appeared in the 1950s (although scientists in Germany had reported a link earlier), while the addictiveness of tobacco, and the isolation of nicotine as the principal addictive constituent, was not established until some time later. Part of the reason for this is simply that scientific progress is generally slow, and scientists themselves are typically not the kind of people to get ahead of themselves.

However, another factor is that at every stage the tobacco industry has resisted the scientific evidence that has indicated the harms associated with the use of its products. One way in which it has done this is by suggesting that there is uncertainty around the core evidence base used to support tobacco control efforts. A 1969 Brown and Williamson document outlines this strategy: “Doubt is our product, since it is the best means of competing with the ‘body of fact’ [linking smoking with disease] that exists in the mind of the general public”.

This approach seeks to “neutralize the influence of academic scientists”, and has since been adopted more widely by other lobby groups. The energy industry has used a similar approach in response to consensus among climate scientists on the role of human activity in climate change. But what’s the problem? There are always a number of ways to interpret data, scientists will hold different theoretical positions despite being in possession of the same basic facts, people are entitled to their opinion… That’s fine, but the tobacco industry goes beyond this and actively misrepresents the facts. Why do I care? Because recently our research was misrepresented in this way…

There is ongoing debate around whether to introduce standardised packaging for tobacco products. Public health researchers mostly favour it, while the tobacco industry is opposed to it. No particular surprises there, but there’s a need for more research to inform the debate. We have done some research here in Bristol suggesting that standardised packs increase the prominence of health warnings in non-smokers and light smokers. Interestingly, we didn’t see this in regular smokers. This research contributed to the recent European Commission Tobacco Products Directive and the UK government consultation on standardised packaging. British American Tobacco (BAT) submitted a response to this consultation, which cited our research and said:

“The researchers concluded that daily smokers exhibited more eye movements towards health warnings when the pack was branded than when it was plain, but the opposite was true for non-smokers and non-daily smokers”.

We didn’t find that, and we didn’t say that. This isn’t a matter of interpretation or opinion – this is simple misrepresentation. What we actually concluded was:

“…among non-smokers and weekly … smokers, plain packaging increases visual attention towards health warning information and away from brand information. This effect is not observed among daily (i.e. established) cigarette smokers”.

In other words, standardised packaging increases the prominence of health warnings in non-smokers and light smokers, but doesn’t seem to have any effect in daily smokers. This is an important difference compared to how BAT represents this research. In their response to the consultation, BAT argues that “plain packaging may actually reduce smokers’ attention to warnings”. Of course it’s possible that there could be negative unintended consequences to standardised packaging, but there is no evidence in our study for this.

Why does this matter? Maybe it doesn’t – people get misrepresented all the time. But scientists produce data and ideas, the latter ideally based on the former, and so to misrepresent their conclusions is fundamentally distorting. Unfortunately this sort of thing happens all the time, including in media coverage of scientists’ work. This often makes scientists less willing to engage in important debates where they could make a valuable contribution. If this happens, then those with clear vested interests will succeed in removing valuable evidence from these debates. More importantly, this example illustrates why it’s vital that scientists do engage with the public and the media. Only by doing so can scientists make sure that their research is accurately represented, and that attempts to misrepresent their research are challenged.

As the health effects of smoking became apparent, successive governments acted to reduce the prevalence of smoking in the population. In the United Kingdom these efforts have been pretty successful – the overall prevalence of smoking is currently around 20%, down from a peak of over 50% in the 1950s. This is due to restrictions on tobacco advertising, increases in taxation on tobacco products, and other tobacco control measures, as well as public health campaigns to increase awareness of the health consequences of tobacco use and greater availability of services to help people stop smoking. We want these policies to be evidence-based, and we don’t want this evidence to be knowingly distorted. Scientists have an important part to play in this.

Posted by Marcus Munafo @MarcusMunafo


Having confidence…

I’ve written previously about the problems associated with an unhealthy fixation on P-values in psychology. Although null hypothesis significance testing (NHST) remains the dominant approach, there are a number of important problems with it. Tressoldi and colleagues summarise some of these in a recent article.

First, NHST focuses on rejection of the null hypothesis at a pre-specified level of probability (typically 5%, or 0.05). The implicit assumption, therefore, is that we are only interested in answering “Yes!” to questions of the form “Is there a difference from zero?”. What if we are interested in cases where the answer is “No!”? Since the null hypothesis is hypothetical and unobserved, NHST doesn’t allow us to conclude that the null hypothesis is true.

Second, P-values can vary widely when the same experiment is repeated (for example, because the participants you sample will be different each time) – in other words, they give very unreliable information about whether a finding is likely to be reproducible. This is important in the context of recent concerns about the poor reproducibility of many scientific findings.
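
To get a feel for how much P-values can bounce around, here is a minimal simulation sketch. It is purely illustrative and not taken from Tressoldi and colleagues: the sample size, the true effect and the function name `two_sample_p` are all assumptions, and it uses a simple z-test with a known standard deviation of 1 rather than the t-test you would normally run on real data.

```python
import random
from statistics import NormalDist, mean


def two_sample_p(n, true_diff, rng):
    """Two-sided P-value from one simulated two-group experiment.

    Assumes (for illustration only) normally distributed scores with a
    known standard deviation of 1 in each group, so a z-test applies.
    """
    group_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    group_b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
    standard_error = (2.0 / n) ** 0.5
    z = (mean(group_b) - mean(group_a)) / standard_error
    return 2.0 * NormalDist().cdf(-abs(z))


# Repeat the "same" experiment 20 times: same design, same true effect,
# but a fresh random sample of participants each time.
rng = random.Random(2013)
p_values = [two_sample_p(n=30, true_diff=0.5, rng=rng) for _ in range(20)]

for i, p in enumerate(sorted(p_values), start=1):
    print(f"replication {i:2d}: P = {p:.4f}")
```

With these hypothetical settings the study has roughly a coin-flip chance of reaching P < 0.05, so the sorted list should typically run from very small values to clearly non-significant ones, even though nothing about the underlying effect changes between replications.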

Third, with a large enough sample size we will always be able to reject the null hypothesis. No observed distribution is ever exactly consistent with the null hypothesis, and as sample size increases the likelihood of being able to reject the null increases. This means that trivial differences (for example, a difference in age of a few days) can lead to a P-value less than 0.05 in a large enough sample, despite the difference having no theoretical or practical importance.
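
The arithmetic behind this is easy to sketch. The example below is an illustration under assumed values, not an analysis of real data: it takes a fixed, trivially small observed difference of 0.01 standard deviations between two groups and asks what P-value a z-test would give at different sample sizes. The function name `p_value_for_difference` is hypothetical.

```python
from statistics import NormalDist


def p_value_for_difference(d, n_per_group):
    """Two-sided P-value for an observed difference of d standard
    deviations between two equal-sized groups (z-test, SD assumed known)."""
    standard_error = (2.0 / n_per_group) ** 0.5
    z = d / standard_error
    return 2.0 * NormalDist().cdf(-abs(z))


# The same trivial difference, evaluated at increasing sample sizes.
trivial_difference = 0.01
for n in (100, 10_000, 100_000, 1_000_000):
    p = p_value_for_difference(trivial_difference, n)
    print(f"n per group = {n:>9,}: P = {p:.3g}")
```

The difference itself never changes; only the standard error shrinks as the sample grows, which is what eventually pushes the P-value below 0.05.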

The last point is particularly important, and relates to two other limitations. Namely, the P-value doesn’t tell us anything about how large an effect is (i.e., the effect size), or about how precise our estimate of the effect size is. Any measurement will include a degree of error, and it’s important to know how large this is likely to be.

There are a number of things that can be done to address these limitations. One is the routine reporting of effect sizes and confidence intervals. The confidence interval is essentially a measure of the precision of our estimate of the effect size, and can be calculated at different levels of confidence. A 95% confidence interval, for example, is calculated so that, if we repeated the study many times, 95% of the resulting intervals would contain the true effect size in the underlying population. Reporting the effect size and associated confidence interval therefore tells us both the likely magnitude of the observed effect, and the degree of precision associated with that estimate. The reporting of effect sizes and confidence intervals is recommended by a number of scientific organisations, including the American Psychological Association, and the International Committee of Medical Journal Editors.
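
As a rough sketch of what this reporting involves, the example below calculates a mean difference with a 95% confidence interval, plus a standardised effect size (Cohen's d), for two small groups. The scores are made up and the function names are hypothetical. Note also that the interval uses a normal approximation, which is an assumption made to keep the example short; with samples this small a t-distribution would give a somewhat wider interval.

```python
from statistics import NormalDist, mean, stdev


def mean_difference_ci(a, b, level=0.95):
    """Mean difference (b - a) with a normal-approximation confidence interval."""
    diff = mean(b) - mean(a)
    standard_error = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff, diff - z * standard_error, diff + z * standard_error


def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_variance = (
        (len(a) - 1) * stdev(a) ** 2 + (len(b) - 1) * stdev(b) ** 2
    ) / (len(a) + len(b) - 2)
    return (mean(b) - mean(a)) / pooled_variance ** 0.5


# Made-up scores for two groups, for illustration only.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]

diff, lower, upper = mean_difference_ci(control, treatment)
print(f"mean difference = {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(f"Cohen's d = {cohens_d(control, treatment):.2f}")
```

Reported this way, a reader can see at a glance both how big the difference is and how uncertain we are about it, which a bare P-value does not convey.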

How often does this happen in the best journals? Tressoldi and colleagues go on to assess the frequency with which effect sizes and confidence intervals are reported in some of the most prestigious journals, including Science, Nature, Lancet and New England Journal of Medicine. The results showed a clear split. Prestigious medical journals did reasonably well, with most selected articles reporting prospective power (Lancet 66%, New England Journal of Medicine 61%) and an effect size and associated confidence interval (Lancet 86%, New England Journal of Medicine 83%). However, non-medical journals did very poorly, with hardly any selected articles reporting prospective power (Science 0%, Nature 3%) or an effect size and associated confidence interval (Science 0%, Nature 3%). Conversely, these journals frequently (Science 42%, Nature 89%) reported P-values in the absence of any other information (such as prospective power, effect size or confidence intervals).

There are a number of reasons why we should be cautious when ranking journals according to metrics intended to reflect quality and convey a sense of prestige. One of these appears to be that many of the articles in the “best” journals neglect some simple reporting procedures for statistics. This may be for a number of reasons – editorial policy, common practices within a particular field, or article formats which encourage extreme brevity. Fortunately the situation appears to be improving – Nature recently introduced a methods reporting checklist for new submissions, which includes statistical power and sample size calculation. It’s not perfect (there’s no mention of effect size or confidence intervals, for example), but it’s a start…


Tressoldi, P.E., Giofré, D., Sella, F. & Cumming, G. (2013). High impact = high statistical standards? Not necessarily so. PLoS One, e56180.

Posted by Marcus Munafo