BACKGROUND
Daily clinical decisions are usually based on personal experience and on evidence available from scientific studies. It is therefore imperative that publications not only provide precise information regarding the methodology used and the results obtained, but also be structured to facilitate reading comprehension.
The first experience with this approach was the CONSORT statement, published in 1996 (Begg et al., 1996), revised in 2001 (Moher et al., 2001) and 2004 (Campbell et al., 2004), and updated in 2010 (No authors listed, 2010). Its objective was to improve the quality of reports of clinical trials (CT) and randomized clinical trials (RCT). It became the example to follow and subsequently encouraged various research groups to generate proposals aimed at improving the reporting of research results.
There are a number of reasons why recommendations, guidelines, checklists and scales for authors are needed. To begin with, authors are faced with the responsibility of persuading reviewers and scientific journal editors of the quality of their study. An adequate investigation is undoubtedly critical in this process, but a proper report of the objectives, design, eligibility criteria, sample size and type of sampling, among others, is no less important. These are some examples of the information that allows a reader to critically evaluate the study. Giving insufficient information may confuse the reader, whereas giving too much may overstate a minor problem.
On the other hand, there are some instruments aimed at evaluating the methodological quality (MQ) or risk of bias of published articles.
The aim of our study was to describe statements, recommendations, proposals, guidelines, checklists and scales available for reporting results and quality of conduct in biomedical research.
MATERIAL AND METHOD
This manuscript was written following the PRISMA statement (Moher et al., 2009).
Study Design. Systematic review (SR).
Eligibility criteria. All types of statements, recommendations, proposals, guidelines, checklists and scales intended to improve the quality of reporting of biomedical research results, published from 1996 onward, were included. No language restrictions or exclusion criteria were applied.
Data Source. A search was made in the databases EMBASE, HINARI, MEDLINE and Redalyc; in the libraries BIREME-BVS, SciELO and The Cochrane Library; in the metasearchers Clinical Evidence and TRIP Database; and on the websites of EQUATOR Network, BMC Medical Education and Europe PMC. The closing date was August 30, 2019.
Search. Sensitive search strategies were carried out in the available literature, without restriction of the year, language or state of the publication (published, unpublished, in process of publication). For this purpose, MeSH or DeCS terms, free-text terms, the Boolean operators AND and OR, truncation and limits were used. The full electronic search strategies for each data source are summarized in Table I.
Publication selection. The process for selecting studies included identification, screening, eligibility, and final inclusion of primary studies in the SR.
Data collection process. The review of the articles was carried out in three stages: first the titles were reviewed, then the abstracts and subsequently the complete texts. This process was carried out by two groups of three researchers each (Group 1: CM, GQ and PS; Group 2: TO, MM and NG). Disagreements were resolved by consensus between the two review groups. Articles that initially met the inclusion criteria were selected for extensive review of the full texts.
Data items. A data extraction form was used that included information on the name, year of publication, number of items, assigned score, domains, versions, objective, type of study design and observations. The retrieved documents were grouped by study design: systematic reviews (SR), meta-analyses and meta-reviews; CT, RCT and quasi-experimental studies; observational studies; diagnostic accuracy studies; clinical practice guidelines; biological material, animal and preclinical studies; qualitative studies; economic evaluation and decision analysis studies; and MQ scales.
Summary measures. No statistical tools were used, because this is a qualitative SR.
Ethics. Names of authors and centers were masked.
RESULTS
The search retrieved 1233 documents, 189 of which were duplicated between the selected sources. After review against the eligibility requirements, 93 documents remained; these make up the study population, as can be seen in the flow diagram (Fig. 1). They are described below and in further detail in Table II.
Systematic reviews, meta-analysis and meta-reviews. A total of 7 checklists, 11 extensions, and 1 update were obtained (n=19).
1. QUOROM Statement. Its objective was to create a tool for the reporting of SR results based on CT. Composed of 6 domains (title/summary, introduction, methods, results and meta-analysis discussion) and 18 items, which include a flow diagram (Moher et al., 1999).
2. MOOSE Proposal. Its objective was to develop an instrument with recommendations for the meta-analysis of observational studies. Composed of 35 items, grouped into 6 domains (Background, Search strategy, Methods, Results, Discussion, Conclusions) (Stroup et al., 2000).
3. AMSTAR Statement. It is a measurement tool for assessing the methodological quality of SR and comprises 11 items (Shea et al., 2007). In 2017, AMSTAR 2 was published for SR that include randomized or non-randomized studies of healthcare interventions; it comprises 16 items, with simpler response categories than the original AMSTAR (Shea et al., 2017).
4. PRISMA Statement. It is the QUOROM update (Stroup et al.). Its objective was to address conceptual and practical advances in SR. Composed of 27 items, grouped into 7 domains (title/summary, introduction, methods, results, discussion and financing) (Moher et al., 2009). It comprises a series of extensions, including: PRISMA-Equity, published in 2012 (Welch et al., 2012) and updated in 2015 as PRISMA-E 2012, for SR and meta-analyses with a focus on health equity, defined as the absence of avoidable and unfair inequalities in health (Welch et al., 2016); PRISMA-C, published in 2014, covering protocols for SR and meta-analyses of RCT or observational studies in newborn and child health research (Kapadia et al., 2016); PRISMA-IPD, an extension for SR and meta-analyses of individual participant data, published in 2015 (Stewart et al., 2015); PRISMA-NMA, an extension statement for SR incorporating network meta-analyses of health care interventions, published in 2015 (Hutton et al., 2015); PRISMA-RR, for reporting rapid reviews, including those with analogous terminology (e.g. rapid evidence synthesis, rapid knowledge synthesis), published in 2015 (Stevens, 2015); PRESS, published in 2008-2010 and updated in 2016, as a guide to improve the peer review of electronic literature search strategies (McGowan et al., 2016); PRISMA-Search, for reporting literature searches in SR, published in 2016 (Rethlefsen et al., 2016); PRISMA-TCM, for reporting SR and meta-analyses of studies that evaluate Chinese herbal medicine or moxibustion, published in 2016 (Bian et al., 2016); PRISMA-ScR, for reporting scoping reviews, which are used to map the concepts underpinning a research area and the main sources and types of evidence available, published in 2018 (Tricco et al., 2018); PRISMA-DTA, reported in 2015, for reporting SR and meta-analyses of diagnostic test accuracy studies (McInnes et al., 2018); and PRISMA-P, consisting of 17 items and 26 sub-items, published in 2015, with the objective of preparing SR protocols that summarize aggregate data from studies, especially evaluations of intervention effects (Moher et al., 2015).
5. MARQ Checklist. Its objective was to develop an instrument for evaluating the methodological quality of meta-reviews, in order to promote transparent and consistent reporting of meta-review methodology. It consists of 20 items grouped in 7 domains (Singh et al., 2012).
6. GRAPH Recommendations. Its aim was to guide the design and reporting of heart rate variability studies in psychiatry, thereby expanding the ability to perform meta-analyses and meta-research in this area. It consists of 13 items distributed in 4 domains (Stevens et al., 2016).
7. ROBIS tool. A tool for assessing the risk of bias in SR. It is aimed at 4 broad categories of reviews, mainly within health care settings: interventions, diagnosis, prognosis, and etiology. It is composed of 5 domains and 24 items presented as questions (Whiting et al., 2016).
CTs, RCTs and quasi-experimental studies. A total of 12 checklists or statements, 17 extensions, 2 updates, and 1 protocol were obtained (n=32).
1. CONSORT Statements. Published in 1996 (Begg et al.) and updated in 2010 (No authors listed, 2010). Its objective was to improve the quality of clinical trial reports. Composed of 22 items grouped into 5 domains (title/summary, introduction, methods, results and discussion). It includes a series of extensions and supplements, among which: STRICTA, published in 2001, whose aim was to create a checklist for reporting RCT in acupuncture, with 6 items, applicable together with the CONSORT statement (MacPherson et al., 2001); RedHot, whose objective was to create an instrument for reporting homeopathic treatments (Dean et al., 2007); NPT List, published in 2005 (Boutron et al., 2005) and updated in 2017, whose objective was to evaluate the quality of CT of nonpharmacological treatments, and which consists of 10 items and 5 sub-elements rated as Yes, No or Not clear (Boutron et al., 2017); CONSORT-PRO, whose objective was to address outcomes reported by patients (PRO), which are usually inadequately reported, thus limiting the value of the data (Calvert et al., 2013); CONSORT-SPI, published in 2013 (Montgomery et al., 2013) and updated in 2018, for reporting RCT of social and psychological interventions, which extends 9 of the 25 items of CONSORT 2010, adds a new item related to stakeholder involvement, and modifies aspects of the flow diagram (Montgomery et al., 2018); IMPRINT, which seeks to improve the reporting of CT of infertility treatments (Harbin Consensus Conference Workshop Group et al., 2014); TIDieR checklist, for reporting interventions in evaluative studies, including CT (Hoffmann et al., 2014); an adaptation for CT in orthodontics (Pandis et al., 2015); the "N-of-1" extension, for evaluating the effectiveness of an intervention in a single patient (Vohra et al., 2016); PAFS, for reporting randomized pilot and feasibility trials, which adds 11 items grouped in 7 domains (Eldridge et al., 2016); KCONSORT (2009), renamed STORK standards (2016), to generate a standard for reporting the results of intervention studies using Kampo products (Motoo et al., 2017); a protocol for a scoping review to support development of a CONSORT extension for RCT using cohorts and routinely collected health data, published in 2018 (Kwakkenbos et al., 2018); SW-CRT, published in 2018, for reporting stepped wedge cluster RCT, consisting of 40 items grouped in 26 domains (Hemming et al., 2018); ADs, published in 2018, an extension for adaptive design RCT, which adjusts 24 items in 16 domains of CONSORT 2010 (Dimairo et al., 2018); MAPGRT, for reporting multi-arm parallel-group RCT, which expands on 10 items of CONSORT 2010 (Juszczak et al., 2019); and PRT, for reporting within-person RCT, which extends 16 items of the CONSORT 2010 checklist and introduces a modified flowchart and baseline table (Pandis et al., 2019). None of them considers score allocation.
2. TREND Statement. Its objective was to generate a tool for CT analysis when it was not possible to perform random assignment. This was composed of 21 items, grouped into 5 domains (Des Jarlais et al., 2004).
3. GNOSIS Guide. Its objective was to standardize the reporting of phase 1 and 2 neuro-oncology CT. It consists of 7 domains and 18 items (Chang et al., 2005).
4. ISPOR RCT Report. Published in 2005 (Ramsey et al., 2005) and updated in 2015. Its objective was to serve as an orientation guide for the design, implementation and presentation of cost-effectiveness analysis reports in the CT. It has 5 domains (design, information elements, database, analysis and report of results), which group 26 items. It does not contain a numerical rating scale (Ramsey et al., 2015).
5. Newcastle-Ottawa Scale (NOS). Its objective was to assess the quality of non-randomized trials in meta-analyses. Its evaluation is currently in progress (Stang et al., 2010).
6. REFLECT Statement. Its objective was to improve the CT report related to "livestock and food safety". Composed of 5 domains and 22 items that include a flow diagram of the participants (Sargeant et al., 2010).
7. Ottawa Declaration. Its objective was to provide guidelines for the ethical design and conduct of cluster CT. It is composed of 7 domains (design, review by ethics committee, participants, informed consent, gatekeepers, risk-benefit assessment, and protection of vulnerable participants) (Weijer et al., 2012).
8. SPIRIT Statement. Its objective was to improve the quality of CT protocols. It consists of 33 items grouped into 5 domains (administrative information, introduction, methods, ethics and dissemination, and appendices) (Chan et al., 2013). It has one extension: SPIRIT-C, for trials in child health, with 11 domains (Clyburne-Sherin et al., 2015).
9. SPAC Therapy Checklist. Its objective was to develop a checklist for trials with alternative therapeutic interventions. It consists of 19 items that are answered on a Likert scale ranging from 1 (disagreement) to 9 (agreement) (Kamioka et al., 2013).
10. StaRI Statement and Checklist. Its aim was to create a statement for reporting implementation studies. It consists of 27 items grouped in 9 domains (Pinnock et al., 2017).
11. TRIALS Guidelines. Its objective was to generate a checklist for reporting embedded recruitment trials. It consists of 36 items grouped into 25 domains (Madurasinghe et al., 2016).
12. ROBINS-I Tool. It is the preferred tool in Cochrane Reviews for non-randomized studies of interventions. It is currently available for cohort designs, with adaptations underway for other study types such as case-control and interrupted time series. ROBINS-I overlaps with RoB 2 (the 'Risk of bias' 2.0 tool) but includes 3 additional domains: confounding, selection of participants into the study, and classification of interventions. A solid command of clinical epidemiology is needed to use it (Sterne et al., 2016).
Observational studies. A total of 5 checklists or statements and 6 extensions were obtained (n=11).
1. STROBE Statement. Its objective was to develop a checklist for reporting the results of cohort, case-control and cross-sectional studies. It consists of 6 domains (title/summary, introduction, methodology, results, discussion and others), and 22 items (von Elm et al., 2007). Different versions are provided according to the design. It has an extension called STREGA, published in 2009 (Little et al., 2009), whose objective was to provide items specific to genetic association studies (genotyping, haplotype modeling, rationale for the selection of genes, etc.). Other extensions are: STROBE-nut, published in 2016, a list of recommendations for reporting nutritional epidemiology and dietary assessment research (24 recommendations for nutritional epidemiology, grouped in 6 domains, were added to the STROBE checklist) (Lachat et al., 2016); INSPIRE Guideline, published in 2016 (Cheng et al., 2016), an extension of the STROBE and CONSORT statements that provides guidelines to improve the quality of reporting for simulation-based research; STROME-ID statement, published in 2014 (Field et al., 2014), to support scientific reporting of molecular epidemiological studies and to encourage authors to consider specific threats to valid inference (20 items were added to the 22 items of the STROBE checklist); STROBE-Vet statement, published in 2016 (Sargeant et al., 2016), for reporting observational studies in veterinary medicine related to health, production, welfare, and food safety (modifications or additions were made to 16 items of the STROBE statement; only 6 items were left unmodified); and RECORD, to help researchers who use routinely collected health data (for research in clinical epidemiology) to comply with the ethical obligations of complete and accurate reporting. It consists of 13 items that complement or modify the items of STROBE (Nicholls et al., 2016).
2. ORION Statement. Its objective was to raise the level of research and publication in hospital epidemiology related to nosocomial infections. Composed of 22 items, grouped into 5 domains (title/summary, introduction, methods, results and discussion), and a summary table (Stone et al., 2007).
3. STNS Score. Its objective was to generate a proposal for evaluating the quality of reports of surgical interventions in the treatment of trigeminal neuralgia. It was partially based on STROBE. It consists of 30 items grouped into 3 domains, and assigns points to its items (0 to 30 points) (Akram et al., 2013).
4. MInCir-ODS Initiative. Published in 2013 and updated in 2017 (Manterola & Otzen, 2017; Manterola et al., 2018). Its objective was to build a checklist for reporting the results of observational descriptive studies. Composed of 19 items, grouped into 4 domains: introduction, methodology, results and discussion.
5. GATHER Statement. Created with the objective of defining and promoting good practice in the reporting of global health estimates (for decision makers and researchers). It comprises 18 items grouped in 6 domains (Stevens et al., 2016).
Diagnostic accuracy studies. Six checklists or proposals, 1 extension and 3 updates were retrieved (n=10).
1. STARD Guidelines. Published in 2003 (Bossuyt et al., 2003) and updated in 2015 (Bossuyt et al., 2015). Its objective was to generate a standard for the report of studies of diagnostic accuracy. Composed of 30 items grouped in 6 domains (title/summary, introduction, methods, results and discussion), a flow diagram and score assignment. In 2015, an extension named ARDENT checklist was created to establish tools for standardized design and reporting of diagnostic accuracy studies of liver fibrosis tests. It consists of 27 items grouped in 5 domains (Boursier et al., 2015).
2. QUADAS Tool. Published in 2003 (Whiting et al., 2003) and updated in 2011 as QUADAS-2 (Whiting et al., 2011). Its objective was to generate a tool for assessing the quality of diagnostic accuracy studies included in an SR. QUADAS-2 is based on the original QUADAS and on evidence about sources of bias and variation in studies of diagnostic accuracy. It is applied in 4 phases: summary of the question; adaptation to the study being analyzed; flow chart for the primary studies; and assessment of the risk of bias and applicability.
3. QAREL Tool. Published in 2010 (Lucas et al., 2010) and updated in 2013 (Lucas et al., 2013). Its objective was to develop a reliability assessment tool for diagnostic test studies, which could also be used in SR of diagnostic tests. Composed of 7 domains (spectrum of subjects, examiners, masking of the examiner, interval between measurements, application and interpretation of the test, order of the examination and analysis of the data) and 11 items. It is applied through questions with 3 response options: "yes" (good quality), "no" (poor quality) and "not clear"; some items also include the option "not applicable".
4. GRRAS Guidelines. Its objective was to develop a tool covering the information regarding reliability and agreement in measurements, especially in healthcare. Composed of 15 items grouped in 6 domains (Kottner et al., 2011).
5. TRIPOD Statement. Its objective was to improve reporting transparency of a prediction model study for individual prognosis or diagnosis, regardless of the study methods used. It consists of 6 dimensions and 22 items (Collins et al., 2015).
6. APOSTEL Recommendations. Its objective was to develop consensus recommendations for reporting the results of quantitative optical coherence tomography studies. It consists of 9 items (Cruz-Herranz et al., 2016).
Clinical practice guidelines. Two checklists and 1 update were retrieved (n=3).
1. AGREE Instrument. Published in 2003 (AGREE Collaboration, 2003), and updated in 2010 as AGREE-II (Brouwers et al., 2010). Its objective was to advance in the development, presentation of reports and evaluation of guidelines in health care through the generation of clinical practice guidelines. It consists of 23 items grouped into 6 domains (Scope and Objective, Participation of stakeholders, Rigor of preparation, Clarity of presentation, Applicability, and Editorial Independence).
2. RIGHT Statement. Its objective was to generate an instrument for reporting practice guidelines in health care. It consists of 28 items grouped in 5 domains (Chen et al., 2017).
Biological material, animal and preclinical studies. Nine guidelines and proposals, and 1 update were retrieved (n=10).
1. MIAME Guidelines. Its objective was to establish a standard for recording and reporting microarray-based gene expression data, thus facilitating the establishment of databases and allowing the development of data analysis tools. Composed of 6 domains (experimental design, array, samples, hybridization, measurement, and normalization of controls) (Brazma et al., 2001).
2. REMARK Guideline. Its objective was to generate recommendations for the publication of studies on tumor markers for prognostic models. Composed of 20 items grouped in 4 domains. It provides for scoring when the instrument is applied, with a maximum of 20 points (McShane et al., 2005).
3. SQUIRE Guidelines. Published in 2008 and updated as SQUIRE 2.0 in 2016. Its objective was to improve the biomedical scientific information reports. Composed of 19 items, grouped into 6 domains (title/summary, introduction, method, results, discussion and others) (Ogrinc et al., 2016).
4. REHBaR Proposal. Its objective was to develop a list of criteria to improve the quality of reporting results in homeopathy basic research. Composed of 23 items, grouped into 4 domains (Stock-Schröer et al., 2009).
5. ARRIVE Guidelines. Its objective was to maximize the published information and minimize unnecessary studies in animals. Composed of 20 items grouped into 5 domains (Kilkenny et al., 2010).
6. GRIPS Statement. Its objective was to improve the quality of the report of genetic risk prediction studies. Composed of 25 items, grouped into 6 domains. For each item, the specific type of information is described, as well as the minimum content that must be reported (Janssens et al., 2011).
7. CARE Guidelines. Its objective was to implement a guide for the reporting of data analysis in case reports. It consists of 13 items (Gagnier et al., 2013).
8. AQUA checklist. Developed for reporting original anatomical studies. It consists of 29 items divided into 8 domains (Tomaszewski et al., 2017).
9. PREPARE Guidelines. Its objective was to reinforce the planning stage of animal experiments. It consists of three domains: formulation; dialogue between scientists and animal facilities; and quality control of the study components (Smith et al., 2018).
Qualitative studies. Four checklists or statements and one update were retrieved (n=5).
1. COREQ Checklist. Its objective was to prepare a checklist for reporting the results of qualitative studies (interviews and focus groups). Composed of 3 domains (research team and reflexivity, design, and data analysis and reporting) and 32 items (Tong et al., 2007).
2. ENTREQ Statement. Its objective was to help researchers report the stages associated with the synthesis of qualitative health research: search and selection of qualitative research, quality assessment and methods to synthesize qualitative findings. It consists of 5 domains and 21 items (Tong et al., 2012).
3. GREET Statement. Published in 2013 (Phillips et al., 2013) and updated in 2016 (Phillips et al., 2016). Its objective was to provide guidance for the reporting of educational interventions for evidence-based practice. It consists of 17 items grouped into 6 domains (descriptive, participants, intervention, content, evaluation and confounding), with 3 response options (fully reported, partially reported, not reported).
4. SRQR Recommendations. Its objective was to improve the transparency of all aspects of qualitative research. It consists of 5 dimensions and 21 items (O'Brien et al., 2014).
Economic evaluation and decision analysis studies. Three documents were recovered (n=3).
1. NHS-HTA Recommendations. Its objective was to develop recommendations to increase the generalizability of economic evaluations. It consists of: recommendations for reporting the results of CT-based economic evaluations (composed of 8 items); a checklist for evaluating the generalizability of CT-based studies (composed of 10 items); and another for evaluating the generalizability of modeling studies (composed of 7 items) (Drummond et al., 2005).
2. NICE-STA Report. Its objective was to provide a checklist to evaluate the quality of economic health reports, especially STA decision analysis models, incorporating elements for economic evaluation. Composed of 46 items, grouped into 7 domains (relevance to current technology, structure, clinical evidence, data utility, use of resources and cost data, uncertainty assessment and consistency); with 4 response options (yes, no, it does not appear and not clear) and comments (Zimovetz & Wolowacz, 2009).
3. CHEERS Statement. Its objective was to develop recommendations to facilitate the reporting of economic evaluation publications. It consists of 24 items grouped into 6 domains (title / summary, introduction, methods, results, discussion and others) (Husereau et al., 2013).
Finally, it can be stated that approximately 64 guidelines, proposals and checklists are under development or in the protocol phase (15 CT and CONSORT extensions, 12 observational studies and STROBE extensions, 10 SR and PRISMA extensions, 2 CT protocols and SPIRIT extensions; and 25 other study designs and clinical areas) (EQUATOR).
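As a consistency check, the counts reported in this section can be tallied in a few lines. The following is only an illustrative sketch: the figures are copied from the text above, the variable and function names are hypothetical, and the number of records excluded during screening is derived here under the assumption that every non-duplicate record not included was excluded.

```python
# Per-group document counts (n) as stated in the RESULTS text.
group_counts = {
    "Systematic reviews, meta-analyses and meta-reviews": 19,
    "CT, RCT and quasi-experimental studies": 32,
    "Observational studies": 11,
    "Diagnostic accuracy studies": 10,
    "Clinical practice guidelines": 3,
    "Biological material, animal and preclinical studies": 10,
    "Qualitative studies": 5,
    "Economic evaluation and decision analysis studies": 3,
}

# Instruments reported as under development or in protocol phase.
in_development = {
    "CT and CONSORT extensions": 15,
    "Observational studies and STROBE extensions": 12,
    "SR and PRISMA extensions": 10,
    "CT protocols and SPIRIT extensions": 2,
    "Other study designs and clinical areas": 25,
}

RETRIEVED = 1233   # documents retrieved by the search
DUPLICATES = 189   # duplicates between sources
INCLUDED = 93      # documents included in the review


def selection_summary():
    """Return the derived counts for the selection flow."""
    unique_records = RETRIEVED - DUPLICATES
    # Assumption: all remaining non-included records were excluded.
    excluded = unique_records - INCLUDED
    return {"unique": unique_records, "excluded": excluded}


if __name__ == "__main__":
    summary = selection_summary()
    print("Unique records after deduplication:", summary["unique"])
    print("Records excluded during screening:", summary["excluded"])
    print("Sum of per-group counts:", sum(group_counts.values()))
    print("Instruments in development:", sum(in_development.values()))
```

The per-group counts add up to the 93 included documents, and the breakdown of instruments in development adds up to 64.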
DISCUSSION
As a summary of the evidence, a considerable number and variety of checklists are available for the reporting of results in biomedical research. They can be used by authors, reviewers and editors, and all are aimed at improving the quality of reporting in scientific articles. They may be of interest and relevance to researchers, who need to know the various options for reporting their results according to the type of study.
The publication of the documents described above (Table II) underscores the current trend toward adequate reporting of results in biomedical research, regardless of the type of design used. Whether they take the form of checklists, check-ups or verification lists, they are all instruments that include criteria for evaluating characteristics that represent the minimum quality required of a manuscript.
As possible limitations, and as may occur in any SR, this study could be subject to publication and reporting bias, as well as to incomplete retrieval of the identified research. For example, we know that there are at least 50 proposals and checklists under development or in the protocol phase in EQUATOR alone (Equator Network, 2020), and there may be others in other data sources that we could not find.
However, it is important to point out that checklists were not designed to assess MQ, only compliance with certain parameters. The MQ construct (a concept that allows assessment of the different aspects of an article, such as type of design, population, methodology, report quality, etc.) is evaluated with ad hoc scales, such as some of those previously mentioned, which could also be used as checklists.
In conclusion, a considerable number and variety of checklists are available for the reporting of results in biomedical research; they can be used by authors, reviewers and editors, and all are aimed at improving the quality of reporting in scientific articles.