Data handling & retention

How Line 21 handles data retention, AI training, and third-party processors in the live captioning pipeline.

Customer knowledge-base article | Version 1.0 | June 2026 | Review cycle: every six months, or when a subprocessor changes

Quick answers

Question	Answer
Is client content used to train third-party AI models?	No. Client content processed through the configured Line 21 pipeline is not used to train third-party AI models.
Can specific vendors be excluded?	Yes. Vendor selection can be constrained per client engagement where policy requires it.
Is EU data residency available?	Yes, depending on the selected speech-recognition and translation providers.
Is Zero Data Retention (ZDR) available?	Yes. ZDR agreements are available for OpenAI and Anthropic engagements that require them.
Are HLS recordings retained?	By default, HLS recordings are retained for 30 days on Line 21 infrastructure. This can be shortened or configured per client engagement.

This article explains how Line 21 handles client content in its live captioning, translation, and AI dubbing pipeline. It covers the main data categories processed, the third-party processors used, how long content may be retained, and whether submitted content is used for AI model training.

This page focuses on the AI and media-processing pipeline. Standard account, billing, and commercial records are handled separately and are not shared with AI vendors.

Our position: client content processed through the Line 21 pipeline is not used to train any third-party AI model. Vendors either commit to this in their standard terms or provide an opt-out that Line 21 has applied.

Data categories processed

Live audio/video streams — the client's event feed, processed in real time for transcription and delivery.
Transcripts, captions and translations — text derived from the audio, passed through correction and translation services.
Event context material — glossaries, speaker names and domain terminology supplied by the client to improve accuracy; held on Line 21 infrastructure.
Account and billing data — standard commercial records, outside the scope of the AI pipeline and not shared with AI vendors.

Important distinction: vendor retention refers to how long a third-party processor may temporarily retain submitted content. Line 21 retention refers to content stored on Line 21 infrastructure, such as HLS recordings or customer-supplied context material. These are governed separately and can be configured per client engagement.

Status labels used in this review

Status	Meaning
No training	The vendor does not use submitted customer content to train its models.
Opt-out applied	The vendor allows an opt-out and Line 21 has enabled it.
Under vendor clarification	Line 21 is confirming the vendor's exact AI-training position.
Line 21 controlled	Data remains on Line 21 infrastructure and is not shared with an AI vendor.

Summary of processor commitments

Speech recognition vendors

Vendor	Content received	Content retention	AI training status	Notes
Speechmatics	Audio for ASR	None	No training	Dashboard-level control.
Deepgram	Audio for ASR	None	Opt-out applied	EU endpoint available for data residency requirements.
Gladia	Audio for ASR	GDPR Standard Practice	No training	Compliance hub states customer data is not used for model training.
AWS Transcribe	Audio for ASR	Auto deletion window of 90 days; earlier available by API	Opt-out applied	AWS service improvement use is disabled for Line 21 account.

LLM correction and proofreading vendors

Vendor	Content received	Retention	AI training status	Notes
OpenAI API	Text	Up to 30 days for abuse monitoring only	No training	Zero Data Retention agreement available for qualifying engagements
Anthropic API	Text	30 days	No training	Zero Data Retention agreement available for qualifying engagements. Anonymised usage metrics contain no client content.

Machine translation vendors

Vendor	Content received	AI training status	Notes
DeepL API Pro	Text for translation	No training	GDPR-compliant; ISO 27001 and SOC 2 Type II certifications.
Google Cloud Translation API	Text for translation	No training	Advanced API offers regional endpoints.
Azure AI Translator	Text for translation	No training	GDPR-compliant; FedRAMP High certified.
OpenAI API	Text for translation	No training	Zero Data Retention agreement available for qualifying engagements

Recording capture and Line 21 storage

Service	Content received	Retention	AI training status	Notes
Attendee	Meeting audio/video and transcripts for online meeting capture	Audio/video retained for 5 days; transcripts retained until deletion	Under vendor clarification	Used only for specific online-meeting captioning workflows. Line 21 can exclude this workflow where client policy requires it.
Line 21 HLS recordings	Video streams delivered to an HLS destination	30 days by default on Line 21 infrastructure	Line 21 controlled, not used for training	Used for client playback and quality assurance. Retention and deletion-on-request are configurable per client engagement.

Client-configurable options

Line 21 is a UK company and processes client data in accordance with UK GDPR and EU GDPR. The processing pipeline can be configured by vendor, region, and retention requirement.

EU data residency — EU endpoints are available for selected ASR and translation providers, including Deepgram EU and regionalised translation endpoints.
Zero Data Retention — ZDR agreements are available with LLM providers such as OpenAI and Anthropic for engagements processing sensitive content.
Vendor exclusion — Specific vendors can be excluded from a client's pipeline configuration where internal policy requires it.
Custom retention — HLS recordings and event context material can be shortened, deleted on request, or configured to meet client requirements.

When Attendee is used

Attendee is only used where Line 21 captions online meetings via a meeting bot. It is not part of every live-event workflow. Line 21 is clarifying Attendee's AI-training position directly with the vendor and evaluating customer-side storage options, under which recordings would reside on Line 21 infrastructure under Line 21's own retention policy.

Review and contact

Last reviewed: June 2026
Review cycle: every six months, or whenever a subprocessor changes
Owner: Fabrizio Ruggeri, CTO and Co-Founder

For questions about data handling on a specific engagement, contact us.

Appendix: vendor reference notes

These notes are intended for security, procurement, and legal reviewers who need the more detailed vendor-by-vendor position behind the public knowledge-base article.

Speechmatics

Realtime SaaS does not store audio, transcripts, or configuration data after processing completes. The workspace-level data-retention toggle is disabled in Line 21's account, so Line 21 traffic is not retained for service improvement. Line 21 does not route client content through Speechmatics' TTS preview product.

Deepgram

Deepgram's default terms permit temporary storage of submitted audio for model improvement. Line 21 opts out programmatically via the mip_opt_out parameter on every request. With the flag set, audio is neither retained nor used for training. Deepgram offers a dedicated EU endpoint for data-residency requirements.

Gladia

Gladia commits contractually to never training on customer data and operates GDPR-compliant retention practices. No content-retention period beyond standard GDPR practice is published.

AWS Transcribe

Transcription jobs are automatically deleted after 90 days and can be deleted earlier via the Delete API. AWS's default permission to use content for service improvement is disabled for Line 21's account via the AWS Organizations AI-services opt-out policy.

OpenAI API

API inputs and outputs are retained by OpenAI for up to 30 days solely for abuse monitoring, then deleted. Line 21 sets store:false on API calls, so content is not stored for later retrieval. Under standard API terms, submitted data is not used to train models. Zero Data Retention is available where required.

Anthropic API

Standard retention is 30 days. Anthropic's standard API terms do not use customer data to train models. A Zero Data Retention agreement is available for engagements that require it. Anonymised usage metrics contain no client content.

DeepL API Pro

Content is not written to persistent storage and is deleted immediately after translation. In exceptional error-debugging cases, content may be held encrypted for a maximum of 72 hours, then automatically deleted. API Pro texts are never used for model training.

Google Cloud Translation API

Submitted text is not stored beyond what is required to serve the API response. Google states that content sent to the Cloud Translation API is not used to train or improve its translation models. The Advanced API offers regionalised endpoints for data-location control.

Azure AI Translator

Azure AI Translator operates a no trace policy by default for API traffic. Customer content is not written to persistent storage and no record is kept in Microsoft data centres. Submitted text and audio are not used for training.

Line 21 HLS recordings

Video streams delivered to an HLS destination are recorded and retained for 30 days on Line 21 infrastructure for client playback and quality assurance. This data is under Line 21's direct control, is not shared with AI vendors, and is not used for training. Retention and deletion-on-request are configurable per client engagement.

Data handling & retention

On this page