
Copilot Privacy Notice

Last updated: 2026-05-07

This page explains what the CSA-in-a-Box Copilot widget collects, why, how long we keep it, and how to opt out.

TL;DR

  • We log your questions and the Copilot's answers so we can spot documentation gaps and keep improving the model's grounding.
  • We redact common secrets (emails, API tokens, JWTs, IPv4 addresses, Azure connection strings) on the server before anything is written to storage. Redaction is best-effort, so please don't paste production secrets into a public chat widget.
  • Your IP address is hashed with a salt before storage. We use the hash to deduplicate analytics; without the salt, it cannot practically be mapped back to your IP address.
  • Raw chat content (your messages + the Copilot's replies) is kept for 90 days, then auto-deleted via Cosmos DB time-to-live.
  • Aggregated metrics (counts, latencies, token usage, opt-out rate) are kept indefinitely.
  • You can opt out at any time. Opting out is sticky to your browser via localStorage.

What we collect

Chat widget (only when you actively use it)

Field | Source | Why
--- | --- | ---
User message (redacted) | What you type into the widget | Identify uncovered topics; improve answers
Assistant reply (redacted) | Generated by Azure OpenAI | Pair with the question for retrieval-quality analysis
Page URL + title | The docs page you were on when you asked | Understand which sections drive questions
Hashed source IP | Salted SHA-256 of X-Forwarded-For's rightmost entry | Rate-limit analytics, repeat-user grouping
Latency, tokens, citations | Server-side measurements | Performance analytics
Session + conversation IDs | UUIDs generated client-side | Stitch a multi-turn conversation together
Thumbs-up/down + improvement text (redacted) | What you submit via the feedback strip | Triage poor answers
Backlog submissions | Use-case requests, bug reports, doc-gap reports | Filed as GitHub Issues for triage
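
As an illustration of the "Hashed source IP" row, here is a minimal Python sketch of salted SHA-256 hashing over the rightmost X-Forwarded-For entry. The function names, the salt handling, and the choice to prepend the salt are assumptions made for this example; the deployed backend may differ in detail.

```python
import hashlib


def rightmost_forwarded_ip(x_forwarded_for: str) -> str:
    """Return the last (rightmost) entry of a comma-separated X-Forwarded-For value."""
    return x_forwarded_for.split(",")[-1].strip()


def hash_source_ip(x_forwarded_for: str, salt: bytes) -> str:
    """Hypothetical helper: salted SHA-256 of the rightmost forwarded address.

    Assumption: the salt is prepended to the address bytes before hashing.
    """
    ip = rightmost_forwarded_ip(x_forwarded_for)
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()


# Example salt for illustration only; a real salt would be a server-side secret.
digest = hash_source_ip("203.0.113.7, 10.0.0.1", salt=b"example-salt")
```

The same address and salt always give the same digest, which is what makes repeat-user grouping possible without storing the address itself.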

Docs site (page-load analytics)

When you visit any page on the docs site, anonymised page-load telemetry is sent to the same Application Insights resource that receives the Copilot's operational metrics:

Field | Source | Why
--- | --- | ---
Page path (no query string) | window.location.pathname | See which docs pages get traffic
Coarse geographic info | Azure Monitor's IP-derived approximation (city/country) | Understand global usage
Browser + OS family | User agent | Compatibility decisions
Page load duration | Browser perf API | Site-performance monitoring
Anonymous session ID | App Insights generates per-tab | Group consecutive page views from the same visit

No cookies are set by the analytics. The Application Insights JS SDK is configured with disableCookiesUsage: true. We do not assign a persistent identifier — every browser tab is a fresh session.

Honors Do Not Track automatically. If your browser sends the DNT: 1 header (Firefox: Settings → Privacy → "Send websites a Do Not Track signal"), the analytics SDK doesn't load. You also opt out implicitly when you opt out of chat tracking — the same localStorage flag covers both.

What we do not collect

  • Your name, email, GitHub username, organisation, or any personal identifier beyond a hashed IP and the randomly generated session and conversation IDs listed above.
  • Cookies. The widget does not set cookies.
  • Cross-site tracking pixels.
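  • Anything from outside the chat panel other than the page URL and title listed above (no scroll tracking, no DOM observation). The widget itself loads no analytics SDK; the docs-site page-load analytics are described separately above.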

Where it lives

Stream | Destination | Retention
--- | --- | ---
Operational metrics | Application Insights (appi-csa-inabox-copilot-fg) | 90 days (Azure Monitor default)
Chat content | Azure Cosmos DB (copilot/conversations) | 90 days (TTL)
Feedback | Azure Cosmos DB (copilot/feedback) | Indefinite (delete on request)
Backlog submissions | Azure Cosmos DB (copilot/backlog) → GitHub Issues | Indefinite (filed publicly on issue creation)
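
To make the 90-day TTL concrete: Cosmos DB expresses time-to-live in seconds, either as a container default or as a `ttl` property on an individual item (honoured only when TTL is enabled on the container). The Python sketch below builds a chat item carrying that value. Every field name other than `id` and `ttl` is hypothetical, and whether the project sets TTL per item or per container is not stated in this notice.

```python
import uuid
from datetime import timedelta

# 90 days expressed in seconds, the unit Cosmos DB uses for TTL.
CHAT_TTL_SECONDS = int(timedelta(days=90).total_seconds())  # 7_776_000


def build_conversation_item(session_id: str, conversation_id: str,
                            user_message: str, assistant_reply: str) -> dict:
    """Hypothetical item shape; only `id` and `ttl` are Cosmos DB system names."""
    return {
        "id": str(uuid.uuid4()),
        "sessionId": session_id,            # assumed field name
        "conversationId": conversation_id,  # assumed field name
        "userMessage": user_message,        # already redacted upstream
        "assistantReply": assistant_reply,  # already redacted upstream
        "ttl": CHAT_TTL_SECONDS,            # item expires this many seconds after last write
    }


item = build_conversation_item("s-1", "c-1", "How do I deploy?", "See the deployment guide.")
```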

All telemetry, chat content, and feedback are stored in Microsoft Azure Commercial — US East 2 region, in tenant infrastructure operated by the project owner. Specific resource identifiers (tenant ID, subscription name, resource-group name, resource names) are intentionally omitted from this notice — they're operational details documented internally in azure-functions/copilot-chat/DEPLOYMENT.md for project maintainers, not user-facing privacy disclosures.

If you require the specific Azure resource details for a compliance or data-residency audit, file an Issue with the privacy label and we'll share them through an authenticated channel.

How redaction works

Before persistence, the backend runs a small library of regex matchers over the user message, assistant reply, and any improvement text:

  • JSON Web Tokens (eyJ...)
  • Provider-prefixed credentials (sk-..., xoxb-..., ghp_..., AIza..., hf_..., etc.)
  • Bearer tokens
  • Azure connection strings (AccountKey=..., SharedAccessKey=...)
  • Azure 88-character base64 keys
  • Email addresses
  • IPv4 addresses
  • Long opaque tokens (≥40 chars of [A-Za-z0-9_-])

Anything matched is replaced with [redacted]. This is best-effort: novel credential formats may slip through. Treat the chat surface as public — never paste production secrets.
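
The matcher approach described above can be sketched in Python as follows. The patterns are illustrative approximations written for this example, not the project's actual regexes, and they cover only a subset of the listed categories.

```python
import re

# Illustrative patterns only; the real matcher library may be stricter or broader.
PATTERNS = [
    # JSON Web Tokens: three dot-separated base64url segments starting with "eyJ"
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # Azure connection-string secrets
    re.compile(r"\b(?:AccountKey|SharedAccessKey)=[A-Za-z0-9+/=]+"),
    # Bearer tokens
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
    # Provider-prefixed credentials
    re.compile(r"\b(?:sk-|xoxb-|ghp_|AIza|hf_)[A-Za-z0-9_-]{10,}"),
    # Email addresses
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    # IPv4 addresses (loose: does not validate octet ranges)
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    # Long opaque tokens: 40 or more characters of [A-Za-z0-9_-]
    re.compile(r"[A-Za-z0-9_-]{40,}"),
]


def redact(text: str) -> str:
    """Replace every match of every pattern with the literal "[redacted]"."""
    for pattern in PATTERNS:
        text = pattern.sub("[redacted]", text)
    return text


print(redact("mail me at alice@example.com from 10.0.0.1"))
```

Specific patterns run before the generic long-token pattern so that a structured secret is replaced as a whole rather than in fragments.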

Opting out

The first time you open the Copilot, a banner asks you to Accept or Opt out. Your choice is stored locally in your browser (localStorage key csa.copilot.privacy.v1) and respected on every subsequent request via the X-Copilot-Opt-Out: 1 header.

When you've opted out:

  • The chat still works normally.
  • The backend skips all persistence and all App Insights events for your requests.
  • The hashed IP is still computed for the session-only rate-limiter, but it is not written to long-term storage.
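
A minimal Python sketch of how a backend could honour the X-Copilot-Opt-Out: 1 header is shown below. The function names and the in-memory `store` list are stand-ins invented for this example; the real handler writes to Cosmos DB and Application Insights.

```python
from typing import Mapping

OPT_OUT_HEADER = "x-copilot-opt-out"  # compared case-insensitively


def is_opted_out(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries X-Copilot-Opt-Out: 1."""
    for name, value in headers.items():
        if name.lower() == OPT_OUT_HEADER:
            return value.strip() == "1"
    return False


def persist_if_allowed(headers: Mapping[str, str], record: dict, store: list) -> bool:
    """Hypothetical persistence gate: skip storage entirely for opted-out requests."""
    if is_opted_out(headers):
        return False
    store.append(record)
    return True


store: list = []
persist_if_allowed({"X-Copilot-Opt-Out": "1"}, {"msg": "hi"}, store)  # skipped
persist_if_allowed({}, {"msg": "hello"}, store)                       # stored
```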

To change your mind, clear browser storage for the docs site or use Clear site data in your browser's developer tools.

Deletion requests

If you'd like a record purged outside the 90-day TTL:

  1. Open a GitHub Issue with the label privacy.
  2. Include your session ID. To find it, open your browser's developer tools, inspect a request sent by the widget, and copy the value of the X-Copilot-Session header.
  3. We'll delete the matching documents from Cosmos and confirm in the issue thread.

Changes to this notice

We'll bump the Last updated date at the top of this page when the collection set, retention, or destinations change. Material changes (new collection field; longer retention; new destination outside Azure or GitHub) will require a fresh banner acceptance — we treat the banner state as version-keyed.

Source code

The full pipeline is open source: