Data Protection Impact Assessment (DPIA)

This DPIA covers the document-review processing carried out by The Counsel (the hosted dashboard at the-counsel.co.uk). It follows the structure of the ICO DPIA template. Self-hosted deployments should adapt this document to their own infrastructure and complete their own sign-off.

Controller: The Counsel (the-counsel.co.uk) · Contact: privacy@the-counsel.co.uk · Last reviewed: 11 June 2026


Step 1 · Identify the need for a DPIA

The service analyses legal documents that users upload. Legal documents are unpredictable in content: a tenancy agreement, employment contract or dispute bundle can incidentally contain special category data (health, union membership, criminal allegations) and data about third parties who are not the user. Processing involves several AI providers acting as processors. The combination of innovative technology and potentially sensitive content makes a DPIA appropriate under UK GDPR Article 35.

Step 2 · Describe the processing

Nature

The pipeline is user-initiated, per document:

  1. Upload — the user uploads a PDF/DOCX or pastes text on /review.
  2. Extraction — text is extracted locally in the application (no third party). For scanned PDFs, page images go to an OCR provider: Cloudflare Workers AI, with chained fallback to OpenAI; a Firecrawl provider option also exists and is run with its zeroDataRetention flag enabled.
  3. AI analysis — the extracted text is sent to Anthropic or OpenAI (per the user's provider selection) for clause, risk and compliance analysis.
  4. Storage — review results are stored as rows in Neon Postgres; the original file in Cloudflare R2. Optionally, a review summary is sent to Google Gemini TTS to generate an audio briefing, which is cached in R2.
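
The four steps can also be read as a data-flow map: for a given upload, which recipient receives which payload. The sketch below is illustrative only. The type names, the `planDataFlow` function and its input flags are assumptions made for this example and are not taken from the application's code; the optional Firecrawl OCR path is omitted for brevity.

```typescript
// Hypothetical model of the per-document data flow described above.
// All names are illustrative assumptions, not the application's API.

type AnalysisProvider = "Anthropic" | "OpenAI";

interface ReviewRequest {
  isScannedPdf: boolean;              // true when local text extraction is not possible
  analysisProvider: AnalysisProvider; // the user's provider selection
  wantsAudioBriefing: boolean;        // optional text-to-speech step
}

interface Transfer {
  step: "extraction" | "analysis" | "storage" | "audio";
  recipient: string;
  payload: string;
}

function planDataFlow(req: ReviewRequest): Transfer[] {
  const transfers: Transfer[] = [];

  // Step 2: only scanned PDFs leave the application for OCR.
  if (req.isScannedPdf) {
    transfers.push({
      step: "extraction",
      recipient: "Cloudflare Workers AI (fallback: OpenAI)",
      payload: "page images",
    });
  }

  // Step 3: extracted text goes to the provider the user selected.
  transfers.push({
    step: "analysis",
    recipient: req.analysisProvider,
    payload: "extracted text",
  });

  // Step 4: results and the original file are stored.
  transfers.push({ step: "storage", recipient: "Neon Postgres", payload: "review results" });
  transfers.push({ step: "storage", recipient: "Cloudflare R2", payload: "original file" });

  // Optional audio briefing, cached in R2.
  if (req.wantsAudioBriefing) {
    transfers.push({ step: "audio", recipient: "Google Gemini TTS", payload: "review summary" });
    transfers.push({ step: "storage", recipient: "Cloudflare R2", payload: "audio briefing" });
  }

  return transfers;
}
```

For a text-based PDF with no audio briefing, the plan contains no extraction transfer, which mirrors the statement that local extraction involves no third party.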

Scope — data categories

| Category | Examples | Source |
|---|---|---|
| Account data | Email, name, profile image | Clerk sign-up |
| Document content | Anything a legal document can contain, including potential special category data and third-party personal data | User upload |
| Derived analysis | Clause findings, risk scores, recommendations, audio narration | Generated |
| Operational data | Usage counters, review-run telemetry, extraction telemetry (no document text), request logs | Generated |

Context and purposes

Users are individuals and small businesses in England & Wales reviewing their own legal documents. The sole purpose is to deliver the analysis the user explicitly requests. Documents are not used to train models — by us or, under their API business terms, by Anthropic or OpenAI.

Step 3 · Consultation

The Counsel is a single-developer product; no DPO is appointed (none is required at the current scale). User-facing processing is described in the public privacy notice, and users can raise concerns via privacy@the-counsel.co.uk. Processor terms (DPAs) are linked from the privacy notice.

Step 4 · Necessity and proportionality

  • Necessity: the service cannot function without processing the document — analysing its text is the service. There is no less-intrusive alternative that still delivers a clause-level review.
  • Lawful basis: performance of a contract (the review the user requests); legitimate interests for security and fair-use limits.
  • Proportionality: processing is user-initiated per document; nothing is analysed in the background. Document content goes to AI providers only at the moment of review. Telemetry deliberately excludes document text. Users control deletion (per review, per matter, or whole account) and can export their data from Settings.
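
One way to make "telemetry excludes document text" hard to violate by accident is to build each telemetry record from an explicit allow-list of fields instead of copying the extraction result. The sketch below assumes hypothetical `ExtractionResult` and `ExtractionTelemetry` shapes; the field names are invented for illustration and are not taken from the codebase.

```typescript
// Hypothetical shapes: field names are assumptions for illustration only.
interface ExtractionResult {
  text: string;             // document content: must never reach telemetry
  pageCount: number;
  method: "local" | "ocr";
  durationMs: number;
}

interface ExtractionTelemetry {
  pageCount: number;
  method: "local" | "ocr";
  durationMs: number;
  charCount: number;        // size of the text only, never its content
}

// Build the record field by field, so any property later added to
// ExtractionResult stays out of telemetry unless it is added here on purpose.
function toExtractionTelemetry(result: ExtractionResult): ExtractionTelemetry {
  return {
    pageCount: result.pageCount,
    method: result.method,
    durationMs: result.durationMs,
    charCount: result.text.length,
  };
}
```

Because the return type has no string field that could hold content, a change that tried to pass the text through would need a visible change to `ExtractionTelemetry` itself.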

Step 5 · Identify and assess risks

| # | Risk | Likelihood | Severity | Overall |
|---|---|---|---|---|
| R1 | Cross-tenant access: one user reads or writes another user's documents/reviews via a tampered identifier | Low | High | Medium |
| R2 | Processor retention: document content persists at an AI provider beyond the review | Low | Medium | Medium |
| R3 | Scanned personal documents: OCR sends page images (which may show ID documents, signatures, handwriting) to a third-party provider | Medium | Medium | Medium |
| R4 | Output leakage / prompt injection: adversarial text inside an uploaded document steers the model into reproducing other context or off-purpose content in the review output | Low | Medium | Low–Medium |
| R5 | Orphaned data after account deletion: files or rows survive an account erasure | Low | High | Medium |
| R6 | Runaway hosted usage: abuse of hosted keys causes excessive processing volume | Medium | Low | Low |

Step 6 · Measures to reduce risk

All measures below exist in the codebase today.

| Risk | Measure |
|---|---|
| R1 | Every non-user row carries a userId (defence-in-depth tenancy); routes that accept a client-supplied matterId must pass resolveOwnedMatterId, which confirms ownership before any write; [id] API routes guard against malformed identifiers. |
| R2 | Providers are used under API business terms that exclude training on customer data; OpenAI's API abuse-monitoring retention is capped at 30 days; the Firecrawl OCR path runs with zeroDataRetention enabled. |
| R3 | OCR is invoked only for scanned documents the user explicitly submits; document-extraction telemetry contains no document text or images and is rotated within 90 days. |
| R4 | Reviews are scoped to a single document per run: the model receives only the submitted document and the system prompt, so there is no cross-user context to leak; outputs are rendered as data, not executed. |
| R5 | Account deletion cascades: Postgres cascade deletes wire user → matter → review/document and review → audio briefing, and the account-deletion route erases the corresponding R2 objects; a Clerk user.deleted webhook performs the same erasure if the account is deleted at the identity provider. |
| R6 | Hosted-trial usage is quota-capped (monthly and per-feature daily limits, with per-company overrides), so a compromised session cannot drive unbounded processing. |
| All | Operational usage records and review-run telemetry are deleted after 365 days by a retention job; users can self-serve a full export of their data (Settings → Export my data) and full erasure at any time. |
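
The R1 measure amounts to resolving a client-supplied identifier against the authenticated user before any write. The following is a minimal sketch of that pattern, not the project's resolveOwnedMatterId. The function name `resolveOwnedId`, the in-memory `Map` standing in for the database, and the UUID format check are all assumptions made for illustration.

```typescript
// Illustrative only: an in-memory stand-in for a table of matters.
interface Matter {
  id: string;
  userId: string; // owner of the row
}

// Assumed identifier format for this sketch; the real format may differ.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the matter id only when it is well-formed AND owned by the caller.
// Missing and foreign rows both yield null, so a caller cannot use the
// response to learn whether another user's matter exists.
function resolveOwnedId(
  matters: Map<string, Matter>,
  sessionUserId: string,
  clientMatterId: string,
): string | null {
  if (!UUID_RE.test(clientMatterId)) return null; // malformed identifier
  const matter = matters.get(clientMatterId);
  if (!matter || matter.userId !== sessionUserId) return null; // absent or not owned
  return matter.id;
}
```

A route handler following this pattern would call the check first and refuse the write whenever it returns null, regardless of the reason.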

Step 7 · Residual risk and sign-off

Residual risk: low. The remaining exposure is inherent to the service: document content must transit AI providers to be analysed, and users may upload documents containing third-party or special-category data that we cannot screen in advance. These are accepted on the basis of the contractual safeguards above, user-initiated processing, and self-service erasure/export.

No residual high risk is identified, so prior consultation with the ICO under Article 36 is not required.

| Item | Name / date | Notes |
|---|---|---|
| Measures approved by | [controller: name, date] | Integrate actions back into project plan |
| Residual risks approved by | [controller: name, date] | If accepting any high residual risk, consult the ICO before going live |
| DPIA review date | [controller: set + 12 months] | Re-run on any change to providers, retention windows or data categories |

AI Legal UK · The Counsel — Established MMXXVI · Built for England & Wales · Not legal advice.