Phase 2: Screen and select studies

Screening converts a noisy pile of database exports into a defensible set of included studies. It has two stages — title/abstract and full text — governed by the same inclusion and exclusion criteria, and both stages require dual independent reviewers for any review claiming to be systematic. This phase produces the single most recognisable artefact of a systematic review: the PRISMA 2020 flow diagram.

1. Lock your inclusion and exclusion criteria before you start

Criteria are derived from the PICO/PEO/PCC framework you built during the search strategy phase. Criteria must be operational — two reviewers reading the same abstract should reach the same decision. Ambiguous criteria ("high-quality studies") are unusable; concrete criteria ("randomised trials with parallel or cluster design, n ≥ 50 per arm") are usable.

A minimum criteria set covers: population, intervention/exposure, comparator, outcomes, study design, setting, language, publication type, and date range. Download the Inclusion/Exclusion Criteria Checklist from our templates library and fill it in before you screen a single record. Any mid-screening change must be logged with a date and a reason, and every already-decided record must be re-screened against the revised criteria.

2. Stage 1: title and abstract screening

Two reviewers independently screen every record against the criteria, tagging each one INCLUDE, EXCLUDE, or MAYBE. At this stage, err toward inclusion: if you cannot tell from the abstract that a study is ineligible, move it to full-text screening. Typically 10–30% of the deduplicated records advance to full-text screening.

Practical workflow:

Import your deduplicated RIS/NBIB into Covidence, Rayyan, or DistillerSR.
Blind the two reviewers to each other's decisions.
Reviewers tag each record; the tool surfaces conflicts automatically.
Resolve conflicts by consensus discussion; if unresolved, a third reviewer adjudicates.
Export the decisions with reasons.
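
If you screen outside a dedicated tool (for example in spreadsheets), conflict detection can be sketched as below. This is a minimal illustration, not how Covidence, Rayyan, or DistillerSR work internally; the function name `find_conflicts` and the record IDs are hypothetical.

```python
from typing import Dict, List, Tuple

VALID_DECISIONS = {"INCLUDE", "EXCLUDE", "MAYBE"}


def find_conflicts(
    reviewer_a: Dict[str, str], reviewer_b: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (record_id, decision_a, decision_b) for every disagreement."""
    # Dual screening only works if both reviewers saw exactly the same records.
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("Both reviewers must screen the same set of records")
    conflicts = []
    for record_id in sorted(reviewer_a):
        a, b = reviewer_a[record_id], reviewer_b[record_id]
        if a not in VALID_DECISIONS or b not in VALID_DECISIONS:
            raise ValueError(f"Unknown decision for {record_id}: {a!r}, {b!r}")
        if a != b:
            conflicts.append((record_id, a, b))
    return conflicts


# Hypothetical decisions exported from two blinded reviewers.
decisions_a = {"rec001": "INCLUDE", "rec002": "EXCLUDE", "rec003": "MAYBE"}
decisions_b = {"rec001": "INCLUDE", "rec002": "MAYBE", "rec003": "EXCLUDE"}

for record_id, a, b in find_conflicts(decisions_a, decisions_b):
    print(f"{record_id}: reviewer A = {a}, reviewer B = {b}")
```

The output is the list to take into the consensus discussion; anything still unresolved goes to the third reviewer.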

3. Stage 2: full-text screening

Retrieve full texts for every record that survived stage 1. Two reviewers independently re-screen against the same criteria, but this time every exclusion must carry a reason (wrong population, wrong intervention, wrong outcome, wrong design, duplicate report of same study, no full text available). PRISMA 2020 requires you to report the count excluded at full text by reason.

Budget roughly 10–15 minutes per full text at this stage. If you are excluding more than 80% at full text, the criteria were probably applied too loosely at stage 1; document the issue and keep going.
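
The per-reason tally and the 80% check can be sketched as follows. The record layout (`id`, `decision`, `reason`) and the function name `summarise_full_text` are assumptions for illustration; adapt them to whatever your screening tool exports.

```python
from collections import Counter


def summarise_full_text(decisions):
    """Tally full-text exclusions by reason and return (counts, exclusion_rate)."""
    # Every full-text exclusion must carry a reason, so fail loudly if one is missing.
    missing = [d["id"] for d in decisions
               if d["decision"] == "EXCLUDE" and not d.get("reason")]
    if missing:
        raise ValueError(f"Exclusions without a reason: {missing}")
    reasons = Counter(d["reason"] for d in decisions if d["decision"] == "EXCLUDE")
    rate = sum(reasons.values()) / len(decisions) if decisions else 0.0
    return reasons, rate


# Hypothetical full-text decisions after conflicts have been resolved.
full_text = [
    {"id": "rec001", "decision": "INCLUDE", "reason": None},
    {"id": "rec002", "decision": "EXCLUDE", "reason": "wrong population"},
    {"id": "rec003", "decision": "EXCLUDE", "reason": "wrong design"},
    {"id": "rec004", "decision": "EXCLUDE", "reason": "wrong population"},
    {"id": "rec005", "decision": "INCLUDE", "reason": None},
]

reasons, rate = summarise_full_text(full_text)
for reason, count in reasons.most_common():
    print(f"{reason}: {count}")
if rate > 0.80:
    print("More than 80% excluded at full text: document the issue")
```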

4. Inter-rater reliability: Cohen's kappa

Agreement between reviewers should be measured, not assumed. The standard statistic is Cohen's κ, which corrects raw percentage agreement for chance:

κ = (p_o − p_e) / (1 − p_e)

where p_o is observed agreement and p_e is expected agreement by chance. Common benchmarks (Landis & Koch, 1977):

κ	Agreement
< 0.00	Poor
0.00–0.20	Slight
0.21–0.40	Fair
0.41–0.60	Moderate
0.61–0.80	Substantial
0.81–1.00	Almost perfect

Aim for κ ≥ 0.60 at title/abstract and ≥ 0.70 at full text. If your pilot kappa is low, pause, refine the criteria, re-train the reviewers, and re-screen. Do not paper over low kappa with post-hoc consensus — fix the criteria.

Calculate kappa on a pilot subset of 50–100 records before full screening, and again on the final dataset. Covidence and Rayyan compute kappa automatically.
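
To check a tool's figure by hand, the formula above can be computed directly from two reviewers' decisions. The sketch below assumes two equal-length lists of labels in the same record order; the function name `cohens_kappa` and the pilot counts are made up for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # p_o: share of records on which the two reviewers gave the same label.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # p_e: chance agreement, from each reviewer's marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    labels = freq_a.keys() | freq_b.keys()
    p_e = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pilot of 100 records:
# both include 20, both exclude 65, A-only include 10, B-only include 5.
pilot_a = ["INCLUDE"] * 20 + ["EXCLUDE"] * 65 + ["INCLUDE"] * 10 + ["EXCLUDE"] * 5
pilot_b = ["INCLUDE"] * 20 + ["EXCLUDE"] * 65 + ["EXCLUDE"] * 10 + ["INCLUDE"] * 5

print(round(cohens_kappa(pilot_a, pilot_b), 3))  # 0.625
```

Here p_o = 0.85 and p_e = (30 × 25 + 70 × 75) / 100² = 0.60, so κ = 0.25 / 0.40 = 0.625. Raw agreement of 85% looks comfortable, yet kappa only just clears the 0.60 title/abstract target, which is why the chance correction matters.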

5. Build the PRISMA 2020 flow diagram

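The flow diagram tells the reader, in one visual, how you went from thousands of records to your final synthesis set. The PRISMA 2020 template has three sections: Identification, Screening, and Included. Full-text eligibility assessment sits inside Screening in the 2020 template; the illustration below labels it as a separate Eligibility step, following the older PRISMA 2009 layout.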

Identification
  842 records identified from databases
  Duplicates removed: 214

Screening
  628 records screened (title/abstract)
  Excluded at title/abstract: 492

Eligibility
  136 full-text articles assessed
  Full-text excluded: 98
    Wrong population (34)
    Wrong intervention (27)
    Wrong outcome (21)
    Non-English without translation (16)

Included
  38 studies included in synthesis

Illustrative PRISMA 2020 flow diagram. Counts are examples for pedagogical use.

Every count in the diagram must be traceable to a screening export. The exclusion reasons listed at the Eligibility stage must match those recorded during full-text screening. An editable PRISMA flow diagram template is available in the templates library.
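
Before drawing the diagram, it is worth confirming that the counts reconcile from one box to the next. The sketch below is one way to do that, using the illustrative counts from the example diagram; the function name `check_prisma_counts` and its parameters are hypothetical.

```python
def check_prisma_counts(identified, duplicates, excluded_title_abstract,
                        full_text_exclusions, included):
    """Derive each box of the flow diagram and verify the final count."""
    screened = identified - duplicates
    assessed = screened - excluded_title_abstract
    excluded_full_text = sum(full_text_exclusions.values())
    expected_included = assessed - excluded_full_text
    if expected_included != included:
        raise ValueError(
            f"Counts do not reconcile: expected {expected_included} included, "
            f"got {included}"
        )
    return {
        "screened": screened,
        "assessed": assessed,
        "excluded_full_text": excluded_full_text,
        "included": included,
    }


# Counts taken from the illustrative diagram above.
boxes = check_prisma_counts(
    identified=842,
    duplicates=214,
    excluded_title_abstract=492,
    full_text_exclusions={
        "Wrong population": 34,
        "Wrong intervention": 27,
        "Wrong outcome": 21,
        "Non-English without translation": 16,
    },
    included=38,
)
print(boxes)
```

For the example: 842 − 214 = 628 screened, 628 − 492 = 136 assessed, 34 + 27 + 21 + 16 = 98 excluded at full text, and 136 − 98 = 38 included.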

6. Screening tools compared

Covidence — purpose-built for Cochrane-style systematic reviews; strong dual-screening and extraction; paid, with institutional licences.
Rayyan — free tier; fast title/abstract screening with a blinded mode; AI-assisted ranking.
DistillerSR — enterprise tool for large, regulated reviews (HTA, policy); very flexible workflow; paid.
EndNote + shared library — viable for small reviews; lacks blinded dual screening.

Pick the tool that matches the review's scope and the team's budget. Document the choice in your protocol.

7. Common screening pitfalls

Single-reviewer screening. Not systematic. Either dual-screen or re-label your review as narrative using our narrative review guide.
Changing criteria mid-screen without re-screening. Introduces selection bias.
Excluding at title/abstract for lack of information. Move ambiguous records to full text.
Ignoring duplicates from different databases. Always deduplicate before screening.
Undocumented exclusion reasons at full text. Violates PRISMA 2020.
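
On the duplicates pitfall, a first-pass deduplication across database exports can be sketched as below. It is a deliberate simplification: it matches on DOI when one is present and otherwise on normalised title plus year, so a record that has a DOI in one export and none in another will slip through and still needs manual checking. The field names, function names, and example records (including the DOIs) are made up for illustration.

```python
import re


def dedup_key(record):
    """Build a matching key: DOI if available, else normalised title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lowercase the title and collapse punctuation/whitespace before comparing.
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title, record.get("year"))


def deduplicate(records):
    """Return (unique, duplicates), keeping the first occurrence of each key."""
    seen, unique, duplicates = set(), [], []
    for record in records:
        key = dedup_key(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


# Hypothetical records exported from four databases.
records = [
    {"source": "PubMed", "doi": "10.0000/example.1", "title": "Exercise and Sleep: A Trial", "year": 2020},
    {"source": "Embase", "doi": "10.0000/EXAMPLE.1", "title": "Exercise and sleep: a trial.", "year": 2020},
    {"source": "Scopus", "doi": "", "title": "Diet and mood", "year": 2019},
    {"source": "CINAHL", "doi": None, "title": "Diet and Mood", "year": 2019},
]

unique, duplicates = deduplicate(records)
print(len(unique), "unique;", len(duplicates), "duplicates removed")
```

Keep the duplicates list: its length is the "Duplicates removed" count in the flow diagram.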

Tools and templates for this phase

Inclusion/Exclusion Criteria Checklist — lock your criteria before you screen.
PRISMA Flow Diagram Template — editable, PRISMA 2020-compliant.
Related review types: systematic reviews, scoping reviews, rapid reviews.

Next phase

With your final included set locked and the PRISMA diagram drafted, move to Phase 3: Data Extraction.