Image credit: Photo by Erol Ahmed on Unsplash

How worthy are mine excellent papers?

Quality assessment criteria for research outputs: The case of the REF2020

How would you evaluate whether your latest research output was an excellent one? More often than not, people answer this question with some reference to an index of journal quality, either the journal impact factor quartile in the InCites Journal Citation Reports (available from the Web of Science platform) or the number of stars in the Academic Journal Guide available from the Chartered Association of Business Schools website.

In fairness, this is not a completely invalid answer: the impact factor or other bibliometric indices of the journals in which academic articles are published may constitute a somewhat indirect measure of the originality, significance and rigour of the work published. But it is only a proxy criterion for any given paper. And it can be invalid: some journals with high impact factors may not have a reputation for publishing rigorous research, citations do not always reflect quality (some articles can be highly cited because they are heavily criticised), and several prestigious journals have been known to retract publications, indicating that severely flawed papers can make it to so-called “top journals.”

So what are we left with? Well, we need to develop the ability to explain directly the extent to which our contribution to knowledge is excellent. How do we go about doing this? There are many points of entry.

Here I chose the UK framework for evaluating research excellence and how it can be used to inform your personal assessment of the quality of your research outputs. In the UK, the quality of research in Higher Education Institutions (HEIs) is assessed by the Research Excellence Framework (REF) system. As outlined in the working methods, each HEI is preparing submissions for its units of assessment which will include a portfolio of several components providing information on staff, outputs, impact, and research environment. In line with the topic of this post, I focus more specifically on the assessment of the research outputs.

Important disclaimer

This is my personal take on the official REF guidance documentation. You should always consult your institution's research office and/or refer to the guidance document for the final say on these topics. If you spot a glaring mistake, please leave a comment below!

How is the quality of research outputs assessed in the REF?

Quality standards

As explained in the working methods,

“The assessment process is based on expert review. (…) The process is objective and evidence based but is not and cannot be purely algorithmic.” (p. 7)

The quality of a research output is to be assessed in terms of three criteria: its originality, its significance, and its rigour. The assessment is made with reference to (starred) international research quality standards. The starred quality levels have both generic definitions as well as descriptive accounts specific to each subject panel.

The generic starred levels of quality standards are defined as follows:

Quality Definition
4* Quality that is world-leading in terms of originality, significance and rigour.
3* Quality that is internationally excellent in terms of originality, significance and rigour but which falls short of the highest standards of excellence.
2* Quality that is recognised internationally in terms of originality, significance and rigour.
1* Quality that is recognised nationally in terms of originality, significance and rigour.
Unclassified Quality that falls below the standard of nationally recognised work. Or work which does not meet the published definition of research for the purposes of this assessment.

Quality criteria

For the purpose of the REF, originality is defined as:

the extent to which the output makes an important and innovative contribution to understanding and knowledge in the field.

Originality can result from the production and interpretation of new empirical findings but can also result from other forms of innovation, such as contributing new and innovative research methods and analytical techniques, or advancing theory, policy or practice.

Similarly, significance is defined as:

the extent to which the work has influenced, or has the capacity to influence, knowledge and scholarly thought, or the development and understanding of policy and/or practice.

Finally, rigour is defined as:

the extent to which the work demonstrates intellectual coherence and integrity, and adopts robust and appropriate concepts, analyses, sources, theories and/or methodologies.

The descriptive accounts of the starred level definitions are set out by main panels to inform subject communities about how the panels will apply the definitions in making their judgements. Descriptive variations reflect disciplinary norms rather than a difference in the quality standards themselves. I review two descriptive variations: those from main panel A, which includes Psychology, Psychiatry and Neuroscience (UoA 4) and those from main panel C, which includes Business and Management Studies (UoA 17).

Main panel A descriptive account

This panel will assess scientific rigour in terms of design, method, execution, and analysis as well as the logical coherence of the argument. The extent to which the research made use of registered reports, pre-registration, publication of data sets, experimental materials, and analytic code, as well as the use of reporting checklists relating to publication or the use of animals in research, will contribute to the evaluation of rigour. International quality guidelines include the ARRIVE guidelines for animal research, the CONSORT guidelines for health research and medical trials, the PRISMA guidelines for systematic reviews and meta-analyses, the COPE guidelines for publication ethics, and the ICMJE recommendations for best practice and ethical standards in the conduct and reporting of medical research, together with the use of iThenticate to detect and prevent plagiarism.

The significance and originality of new and replication studies will be assessed in terms of the extent to which the output makes an important and innovative (originality) and influential (significance) contribution to knowledge and theory-building as well as to practice in areas including (but presumably not limited to) education, management, health, healthcare, public health, food security, or animal health or welfare policy.

Citation information for a given output (but not the journal impact factor or other information related to journal ranking) will inform the assessment of significance. This is a complicated area and, according to guidance from the Forum for Responsible Research Metrics, there is no clear best practice. The Forum clearly points out that Google Scholar Citations as well as alternative web indicators such as Altmetrics, web metrics, or download indicators are all “spammable” and are not reliable indicators of significance. Reading between the lines, citation information from Scopus or the Web of Science can be more relevant, but the difficulty then lies in interpreting it (e.g., what represents a significant number of citations?).

Main Panel C descriptive account

Main Panel C provides additional information about quality standards, which can be summarised as follows:

Quality Definition
4* Outstandingly novel in developing concepts, paradigms, techniques or outcomes, a primary or essential point of reference and a formative influence on the intellectual agenda with application of exceptionally rigorous research design and techniques of investigation and analysis and the generation of an exceptionally significant data set or research resource
3* Novel in developing concepts, paradigms, techniques or outcomes, an important point of reference, contributing very important knowledge, ideas and techniques which are likely to have a lasting influence on the intellectual agenda, application of robust and appropriate research design and techniques of investigation and analysis and the generation of a substantial data set or research resource.
2* Providing important, incremental and cumulative advances in knowledge and the application of such knowledge with a thorough and professional application of appropriate research design and techniques of investigation and analysis.
1* Providing useful knowledge, but unlikely to have more than a minor influence, largely framed by existing paradigms or traditions of enquiry, and providing competent application of appropriate research design and techniques of investigation and analysis.

Novelty thus appears to be a key discriminating feature of 4* and 3* quality, together with the quality of being perceived as an influential point of reference and the extent to which the output features rigorous research methods and generates a significant data set.

What’s in it for me? How to gauge the quality of your own outputs

As the REF descriptive accounts highlight, ‘World-leading’ quality refers to an absolute standard of quality and will be specific to each unit of assessment. While these criteria implicitly assume that there exists a consensus on what constitutes an absolute standard of quality in research, in reality this is likely to vary from one peer-reviewer to another, depending on their disciplinary background. They are also likely to change across time. For example, the importance of open science practices has gained considerable support in the past few years in disciplines represented under Main Panel A.

It is important that you develop a very clear understanding of what constitutes current absolute standards of quality in your own sub-discipline.

Engage with the state-of-the-art intellectual agenda in your field

To learn about absolute standards of quality in your area of research, keep abreast of the latest developments in your field and of what other researchers are currently focusing on (and what they are missing…). Seek to understand what’s original and what’s not, which topics of research are considered important and current, and what’s considered best research practice.

To do so, identify key conferences and leading researchers, talk to editors of influential journals if you can, listen to their podcasts, or read their editorials.

To illustrate, consider recent editorials of Psychological Science, all focused on open science practices and research rigour, from replication and pre-registered direct replications to sharing data and materials. This suggests that the criteria for excellent psychological science research are evolving and that the robustness of the methods underpinning the research (including current best practices such as pre-registration, sample size determination, and statistical power) and the associated reliability of the findings are likely to feature prominently among indices of rigour.

Likewise, the latest editorial of the Journal of Experimental Psychology: Applied also mentions the need to address issues of replicability of findings and focuses on the importance of “use-inspired” basic research, balancing the need for theories, testable hypotheses, and the use of “authentic” tasks in representative environments, as well as participants who are knowledgeable about these tasks and environments. In terms of rigour, the editor notes the importance of experimental methods supporting causal inferences as the hallmark of methodological rigour while also pointing to the need to “keep pace with analytic and statistical advances” (p. 2) and provides examples of what those advances are: multilevel modeling and/or the use of Bayesian approaches to facilitate the design of informative experiments.

Reflect on the importance of your research

To assess the importance of your research, ask yourself to what extent it:

  • produced and interpreted new or significant empirical findings or new material?
  • engaged with new and/or complex problems?
  • developed innovative research methods, methodologies and analytical techniques?
  • showed imaginative and creative scope?
  • provided new arguments and/or new forms of expression, formal innovations, interpretations and/or insights?
  • collected and engaged with novel types of data?
  • advanced theory or the analysis of doctrine, policy or practice, and new forms of expression?

Additional common REF FAQs about outputs

What are the census dates for my outputs to be eligible?

Research outputs must have been published between 1 January 2014 and 31 December 2020 to be eligible.

How many outputs will I be expected to return?

The average number of outputs expected from a full-time member of staff who is engaged in research is 2.5, rounded to the nearest whole number. This can be interpreted as meaning that the average expected output performance during the census period is 3 research outputs per member of staff. Any given member of staff can be “returned” with a minimum of one output and a maximum of five outputs. You may have additional outputs returned beyond your core five, but only if they can be attributed to your co-authors (provided they themselves do not already have five outputs to return).

These new guidelines are meant to allow HEIs to present their best selection of staff outputs.
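To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes that the 2.5 average is multiplied by the full-time-equivalent (FTE) total of the staff being returned and that halves are rounded up; both are my reading of the rule rather than an official formula. The function names and the example allocations are hypothetical and purely illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

OUTPUTS_PER_FTE = Decimal("2.5")  # average number of outputs per FTE, as stated above
MIN_PER_PERSON = 1                # minimum outputs attributed to any one member of staff
MAX_PER_PERSON = 5                # maximum outputs attributed to any one member of staff


def expected_outputs(total_fte):
    """Expected number of outputs for an FTE total (assumption: halves round up)."""
    raw = OUTPUTS_PER_FTE * Decimal(str(total_fte))
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def allocation_is_valid(outputs_per_person, total_fte):
    """Check a hypothetical allocation: 1 to 5 outputs each, summing to the expected total."""
    within_limits = all(
        MIN_PER_PERSON <= n <= MAX_PER_PERSON for n in outputs_per_person
    )
    return within_limits and sum(outputs_per_person) == expected_outputs(total_fte)


# One full-time researcher: 2.5 rounds up to 3 outputs.
print(expected_outputs(1))

# Four full-time researchers can share 10 outputs unevenly, within the 1-to-5 limits.
print(allocation_is_valid([5, 3, 1, 1], 4))
```

Under these assumptions, the uneven split is what lets a unit put forward its strongest work: one colleague can be returned with five outputs while another is returned with only one.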

What counts as an output?

Defined broadly, an output is the product of an investigation leading to new insights that is disseminated in the form of assessable research outputs and confidential research reports.

In reality, what counts as a significant output may vary from one unit of assessment to another.

Have I missed something? Do you disagree with some of my points? Are there other important avenues for assessing and improving the quality of our research? If you think this post could be developed further, feel free to comment and make suggestions below. I will acknowledge all comments that lead me to update this content.

Gaëlle Vallée-Tourangeau
Professor of Behavioural Science
