01 Trust & Evidence

Trust & Evidence.

This is the evidence room. Every claim below carries a date and a way to check it — and every boundary is stated next to the claim it bounds.

DataSitr is a Saudi-hosted AI privacy gateway with public evidence for its current operating boundary. Customer traffic runs on Alibaba ACK Riyadh, the 4-hour May 4 soak passed, and the GCP Dammam drill-standby exercise proved DNS/GKE/TLS routing only. Cross-cloud DB replication, auth failover, regulator approval, HSM custody, and full-region tolerance are not yet claimed.

Standing as of 2026-06-02 · dated

02 The three things we measured

Proof, then its boundary.

Riyadh primary + Dammam drill standby

Customer traffic flipped from legacy edge to Alibaba ACK Riyadh on 4 May 2026; GCP Dammam drill standby passed a scoped exercise on 16 May 2026.

DNS A records for datasitr.com and api.datasitr.com cut over to ACK ingress 8.213.49.193 at 2026-05-04T01:12:50Z. A 4-hour soak passed without regression. A separate disposable drill hostname validated DNS-level switching to GCP Dammam ingress 34.110.171.105, TLS continuity on standby.gcp.datasitr.com, GKE workload health, evidence capture, and rollback. The Dammam footprint is cost-controlled drill infrastructure; cross-cloud DB replication and auth failover are not yet active.
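A minimal sketch of how a reviewer might re-check the attested A records, assuming only the Python standard library. The hostnames and the ingress IP are copied from the paragraph above; the function names are illustrative, and DNS can change after the attested date, so a mismatch today does not contradict the dated fact.

```python
import socket
from typing import Callable, Iterable

# Attested values copied from this page.
EXPECTED_INGRESS = "8.213.49.193"
HOSTNAMES = ("datasitr.com", "api.datasitr.com")


def resolve_a_records(hostname: str) -> set[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def check_cutover(
    hostnames: Iterable[str],
    expected_ip: str,
    resolver: Callable[[str], set[str]] = resolve_a_records,
) -> dict[str, bool]:
    """Map each hostname to True when its A records are exactly {expected_ip}."""
    return {host: resolver(host) == {expected_ip} for host in hostnames}


if __name__ == "__main__":
    # Performs real DNS lookups only when run as a script.
    for host, ok in check_cutover(HOSTNAMES, EXPECTED_INGRESS).items():
        print(f"{host}: {'matches' if ok else 'differs from'} {EXPECTED_INGRESS}")
```

The resolver is injectable so the comparison logic can be exercised without network access.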

attested · fixed facts

Control traceability

177 controls are tracked, split 148 code-test / 16 dated-evidence / 13 external-fact / 0 pending.

control_matrix.json · dated 2026-06-02 · live

"0 pending" covers substantiation-class assignment; the matrix separately flags 5 controls with active coverage gaps (2 PDPL) under controls_with_coverage_gap in control_matrix.json.

KMS + detector v8 scorecard

Alibaba KMS startup bootstrap remains live; the public detector v8 scorecard is ready with all 8 release gates passing.

pii_benchmark_latest.json · dated 2026-04-29 · live

Keep key-custody claims bounded to startup bootstrap on the serving ACK image. Detector v8 is live on the May 20 ACK runtime baseline; detector readiness is still the published in-repo scorecard, not an external audit or production-wide coverage guarantee.

03 How to read this page

The grammar of this room.

Boundary

“Proven” means measured or drilled on the live pilot.

If a capability has not been exercised on the live pilot or cannot be reproduced from retained evidence, it stays outside the proof language.

Verification

Public artifacts are sanitized; full mappings are request-gated.

Use the public trust report, control-matrix summary, detector scorecard JSON, and benchmark JSON for buyer-safe review. Qualified reviewers can request the signed bundle for control-level implementation, test, and evidence mappings.

Not implied

Trust evidence stays tied to the surfaces actually exercised.

Where the evidence is narrower, the wording stays narrower: full-vault verification, HSM-backed custody, immutable-retention controls, and unplanned full-region failure tolerance remain explicit separate steps.

Live, attested, or a dated snapshot

This is also why some values below are marked live (fetched from a named artifact as you read), some are attested (fixed facts like IP addresses, timestamps, and the regulator's own wording), and one is deliberately not a number at all — where the honest artifact is a dated snapshot, we say so instead of freezing a figure into this page.

live · attested · dated snapshot

04 Live pilot evidence, dated

Dated proof of a specific surface.

Start here for dated proof of a specific surface. Each row names the retained evidence and the citation to check. These rows are not an external audit, regulator approval, or HSM-custody claim.

April 21 guarded rollout baseline

Current route proof. The May 4 customer-route cutover baseline is the current public ACK/API route proof; the 2026-05-16 scoped Dammam drill-standby exercise adds DNS/GKE/TLS evidence only. Citation: evidence/ha/alibaba-live-2026-05-04T01:17:03Z/ and evidence/multi-region-drill/multi-region-warm-standby-20260516T220433Z/.

PDPL citation integrity

Citation source. The in-repo SDAIA-published PDPL English text backs automated article-reference checks across code and docs, not an external legal opinion. Citation: scripts/validate_pdpl_citations.py.

Billing integrity

Billing chain. Billing events use SHA-256 hash-chain continuity and HMAC authentication for newer records, while the 10-year retention gate refuses in-retention deletion. Citation: docs/billing-integrity.md and tests/test_billing_integrity.py.
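To illustrate what hash-chain continuity with HMAC authentication means in practice, here is a minimal sketch using only the Python standard library. It is not the code in docs/billing-integrity.md or tests/test_billing_integrity.py: the record layout, the genesis value, and the function names are assumptions, and the sketch tags every record, whereas the row above says HMAC covers newer records.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical anchor value for the first record


def record_digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def append_event(chain: list[dict], payload: dict, key: bytes) -> None:
    """Link a new event to the chain tail and authenticate its digest with HMAC."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    digest = record_digest(prev_hash, payload)
    tag = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    chain.append({"payload": payload, "prev_hash": prev_hash, "hash": digest, "hmac": tag})


def verify_chain(chain: list[dict], key: bytes) -> bool:
    """Recompute every link; any edited payload, broken link, or wrong key fails."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        digest = record_digest(prev_hash, record["payload"])
        if digest != record["hash"]:
            return False
        expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, record["hmac"]):
            return False
        prev_hash = digest
    return True
```

Because each digest includes the previous one, changing an old event invalidates every later link, which is what lets a reviewer detect edits without trusting the storage layer.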

PII detection

Saudi PII coverage. English, Arabic, and Saudi-specific recognizers cover National ID, IBAN, phone, and related patterns; the repository-side v8 Arabic NER bundle adds FAC-label coverage while measured results stay on the benchmark page. Citation: saudivault/saudi_patterns.py, docs/detector-v8-release-notes-2026-05-19.md, and /benchmark.

Three-lane privacy routing

Policy routing. Green tokenizes before external routing, Amber pseudonymizes in-Kingdom, and Red keeps raw processing in-Kingdom according to tenant policy. Citation: saudivault/policy.py.
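The three lanes can be pictured as a small lookup table. The sketch below is illustrative only and is not the contents of saudivault/policy.py: the class names, the policy shape (a mapping from data category to lane name), and the fallback to Red for unknown categories are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


@dataclass(frozen=True)
class LanePlan:
    transform: str  # what happens to detected PII
    locality: str   # where the row above says the step runs


# Wording mirrors the row above; nothing here widens it.
PLANS = {
    Lane.GREEN: LanePlan("tokenize", "external routing after tokenization"),
    Lane.AMBER: LanePlan("pseudonymize", "in-Kingdom"),
    Lane.RED: LanePlan("none (raw)", "in-Kingdom"),
}


def lane_for(tenant_policy: dict[str, str], data_category: str) -> Lane:
    """Look up the lane for a category; unknown categories fall back to RED (assumption)."""
    return Lane(tenant_policy.get(data_category, Lane.RED.value))
```

Falling back to the most restrictive lane is a design choice made for this sketch so that an unmapped category never leaves the Kingdom by accident.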

Encrypted vault

Vault encryption. Stored token originals use AES-256-GCM with per-tenant derived keys, and transit uses TLS 1.2/1.3. Citation: saudivault/vault.py and nginx/datasitr.conf.
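This page does not name the key-derivation function behind "per-tenant derived keys". As one plausible illustration, the sketch below derives a 32-byte (AES-256 sized) key per tenant with HKDF-SHA256 (RFC 5869) built from the standard library. HKDF, the "vault:" info label, and the function names are assumptions swapped in for illustration, not the method in saudivault/vault.py; the AES-256-GCM step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand as described in RFC 5869, using HMAC-SHA256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF-SHA256")
    # Extract: an empty salt is replaced by HASH_LEN zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_key(master_key: bytes, tenant_id: str) -> bytes:
    """Derive a distinct 256-bit key per tenant (hypothetical info label)."""
    return hkdf_sha256(master_key, salt=b"", info=b"vault:" + tenant_id.encode())
```

Binding the tenant identifier into the derivation means one tenant's key cannot decrypt another tenant's stored originals, even though both descend from the same master key.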

Guarded deploy with rollback

Rollback drill. A forced public-smoke failure restored the prior deploy hash and health, with the proof note retained in-repo and raw drill logs retained separately. Citation: docs/rollback-drill-evidence-20260326.md.

Signed evidence export

Signed export. Sequenced processing records can export signed verification material for reviewer use; final immutable-retention posture remains separate. Citation: saudivault/compliance.py.

Off-host encrypted backup

Backup evidence. Dated pilot notes cover encrypted upload, download, and restore-drill verification. Treat freshness as an operator-verified date, not a standing guarantee. Citation: docs/backup-hardening-summary-20260330.md and docs/off-host-backup-evidence-20260324.md.

Isolated restore recovery

Isolated restore. The March 28 drill restored an encrypted backup and completed fresh logins in isolation, proving single-stack recoverability only. Citation: docs/backup-hardening-summary-20260330.md.

Monitoring and alerting

Monitoring proof. The pilot has dated operator evidence for health checks, metrics, log retention, and alert delivery, with freshness treated as a dated check. Citation: evidence/ha/alibaba-live-2026-05-04T01:17:03Z/ and docs/ha-evidence-gate.md.

Shared-state scaling evidence

Scaled route evidence. Alibaba ACK has dated customer-route HA proof, and GCP Dammam has scoped DNS/GKE/TLS drill proof only. Citation: evidence/ha/alibaba-live-2026-05-04T01:17:03Z/ and evidence/multi-region-drill/.

Auth survivability

Auth-path drill. The March 29 drill showed fresh login and authenticated processing during an intentional auth-path outage; the Dammam drill covers scoped standby routing only. Citation: docs/customer-security-one-pager.md.

Public restored-state cutover

Restored-state check. The March 29 rerun verified that the oldest and newest restored vault rows decrypt successfully under the restored environment; it does not prove that every vault row decrypts or imply full-vault verification. Citation: evidence/restore-drill-20260504T124300Z/ and docs/customer-security-one-pager.md.

Alternate public path

Alternate path. The March 29 drill showed the public path served through an alternate host under operator control; this row remains historical continuity evidence. Citation: docs/evaluator-packet-worker2.md and docs/customer-security-one-pager.md.

Control matrix and public trust report

Matrix totals. The public report summarizes 177 controls (148 code-test / 16 dated-evidence / 13 external-fact / 0 pending) with reviewer-only mappings removed.

control_matrix.json · dated 2026-06-02 · live

Citation: docs/generated/control_matrix.json · /trust-report
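The class split can be sanity-checked with simple arithmetic: 148 + 16 + 13 + 0 = 177. A minimal sketch follows; the dictionary keys are illustrative labels taken from the wording above, not the field names in control_matrix.json.

```python
# Counts copied from the "Matrix totals" row; key names are illustrative.
CLASS_COUNTS = {
    "code-test": 148,
    "dated-evidence": 16,
    "external-fact": 13,
    "pending": 0,
}
TOTAL_CONTROLS = 177


def split_is_consistent(counts: dict[str, int], total: int) -> bool:
    """True when the per-class counts add up to the stated total."""
    return sum(counts.values()) == total


if __name__ == "__main__":
    print(split_is_consistent(CLASS_COUNTS, TOTAL_CONTROLS))
```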

Detector benchmark artifacts

Curated detector benchmark. The public detector v8 scorecard is ready with all 8 gates passing, and the PII benchmark reports a p95 of 47.92 ms for the 1K-character English case.

pii_benchmark_latest.json · dated 2026-04-29 · live · detector engine p95

Citation: /benchmark

05 Regulatory standing, in writing

Regulatory standing, in writing.

attested wording, not a metric — there is no artifact to regenerate

Each row is the exact phrasing the issuing authority has put in writing. We deliberately distinguish "registered" from "licensed" and "applied" from "awarded" — because the Saudi regulators do.

National PDP Register #3260005651

Active. The owner is the registered Data Protection Officer for مؤسسة داتا ستر / Data Sitr Establishment under the Saudi Personal Data Protection Law (PDPL).

ACTIVE

NDGP Data Services Provider Registration LR-25-000018

Registered as a data services and products provider on the National Data Governance Platform (NDGP); status "Complete" on the dashboard. NDMO has clarified in writing (2026-04-27) that this registration does NOT constitute a license — the licensing application window will open in an upcoming phase.

REGISTERED — NOT LICENSED

SDAIA AI Service Provider Accreditation AE-26-000237

Reviewed and not awarded by the Saudi Data and AI Authority. The AE-26-000237 application (filed 2026-04-03) was not approved; we hold no SDAIA accreditation.

NOT AWARDED

Commercial Registration 4030483372

Active under the Ministry of Commerce since 2022-08-31. Entity type: Establishment. Registered under the current name مؤسسة داتا ستر / Data Sitr Establishment.

ACTIVE

Unified National Number 7030618388

700-prefix unified national number for the same Saudi establishment. It is not the Commercial Registration number.

IDENTIFIER

Enforcement environment: SDAIA confirmed 48 PDPL enforcement decisions in January 2026 covering unlawful processing, weak security controls, and unconsented marketing — administrative fines up to SAR 5 million, doubled for repeat violations, with intentional sensitive-data violations carrying up to two years' imprisonment.

data-source-date 2026-01 · attested

06 Available for buyer review

Available for buyer review.

Assurance surfaces a buyer or security team can inspect immediately — no widening of the claims boundary required.

Control Traceability Matrix

177 controls are mapped into substantiation classes that buyers can inspect safely.

The public trust report shows the aggregate proof counts, while the full Ed25519-signed reviewer bundle remains available to qualified reviewers on request.

Public Trust Report

A sanitized report now summarizes what the matrix proves without leaking file paths or line numbers.

Open the report at /trust-report or consume /resources/trust-report.json for automated review; the totals match the generated control-matrix JSON.

PDPL Citation Integrity

The authoritative SDAIA-published PDPL English text is included in-repo as the citation source of truth.

A per-citation validator at scripts/validate_pdpl_citations.py enables automated audit of article references across the codebase.

Live Key Custody

The May 4 customer-route cutover baseline (current) continues to bootstrap its startup master key through Alibaba KMS.

Keep that claim bounded to startup bootstrap on the serving ACK image. Tenant BYOK and HSM custody remain outside the current live boundary.

Edge WAF

Cloud Armor blocks OWASP Top 10 attack patterns at the Dammam drill-standby ingress during exercises.

Pre-configured rule sets for SQL injection, cross-site scripting, local and remote file inclusion, and remote code execution are attached to the GKE Ingress backend. Blocked-request counts are visible to the operator on the Cloud Monitoring dashboard.

Uptime monitoring

Cloud Monitoring uptime checks poll the Dammam drill-standby /healthz endpoint while the drill footprint is enabled.

An alert policy notifies the operator by email if the pass rate falls below 50% over a three-minute window. The current public status remains summarized at /status.
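The alert condition (pass rate below 50% over a three-minute window) can be expressed in a few lines. This is a sketch of the rule as worded above, not the Cloud Monitoring policy definition; the function name, the input shape, and the choice to stay silent when the window holds no checks are assumptions.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=3)
THRESHOLD = 0.5


def should_alert(
    checks: list[tuple[datetime, bool]],
    now: datetime,
    window: timedelta = WINDOW,
    threshold: float = THRESHOLD,
) -> bool:
    """True when the share of passing checks inside the window is below threshold."""
    recent = [passed for ts, passed in checks if now - window <= ts <= now]
    if not recent:
        return False  # assumption: an empty window does not raise an alert here
    return sum(recent) / len(recent) < threshold


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [(now - timedelta(seconds=30 * i), i % 3 == 0) for i in range(6)]
    print(should_alert(sample, now))
```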

Vulnerability scan program

OWASP ZAP, nuclei, and Trivy run on a documented cadence with registered-DPO review of findings.

The scan workflow lives at .github/workflows/security-scan.yml; the program is documented at docs/security/vulnerability-scan-program.md. Findings are stored under evidence/security-scans/ and reviewed by the registered DPO before any change of public posture.

Security questionnaire library

A pre-compiled response library covers the categories enterprise procurement and vendor security teams ask about.

Library at docs/security/questionnaire-response-library.md spans company background, compliance posture, encryption and key management, access controls, incident response, subprocessors, data-subject rights, cross-border transfer, business continuity, audit access, and AI-specific routing details. Available to qualified buyers on request alongside the signed reviewer bundle.

Academy UX redesign

The Academy dashboard redesign is a guarded-deploy UI improvement; it does not change processing API behavior.

The changelog records the cleaner training navigation, companion PDPL course surface, and accessibility polish. The change is dashboard presentation only: no routing, tokenization, provider, or lawful-basis API semantics change.

HA evidence freshness gate

CI refuses to ship if the high-availability drill evidence is older than 168 hours.

Pinned drill artifacts in the deploy workflow enforce that the multi-region drill attestation is fresh. Stale evidence fails the gate and blocks the next deploy until a new drill is captured and signed with the published Ed25519 key.
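A freshness gate of this kind reduces to a timestamp comparison. The sketch below assumes the capture time can be read from an evidence directory name shaped like the ones cited on this page (for example multi-region-warm-standby-20260516T220433Z); the function names are illustrative, the real deploy workflow may read the time from a signed attestation instead, and signature verification is out of scope here.

```python
import re
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=168)
STAMP = re.compile(r"(\d{8}T\d{6}Z)")


def evidence_time(name: str) -> datetime:
    """Extract the UTC capture time embedded in an evidence directory name."""
    match = STAMP.search(name)
    if match is None:
        raise ValueError(f"no timestamp found in {name!r}")
    parsed = datetime.strptime(match.group(1), "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_fresh(name: str, now: datetime, max_age: timedelta = MAX_AGE) -> bool:
    """True when the evidence is not from the future and not older than max_age."""
    age = now - evidence_time(name)
    return timedelta(0) <= age <= max_age


if __name__ == "__main__":
    drill = "multi-region-warm-standby-20260516T220433Z"
    # A CI step could exit non-zero when this prints False.
    print(is_fresh(drill, datetime.now(timezone.utc)))
```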

  • Public reviewer artifacts: public trust report, control matrix summary, compliance reviewer pack, benchmark JSON artifacts, and the compliance summary page
  • Dashboard compliance tab: processing records, DPIA, audit summary, evidence pack, and compliance bundle with copy/download for procurement review
  • Dedicated regulator portal: read-only regulator access during evaluation by request, with cross-tenant processing records, SDAIA-shaped report builders, scoped signed-package generation for handoff artifacts, and a separate regulator access log

07 Where the boundaries live

One page. One source of truth.

The constraints are centralized so every reviewer sees the same wording. If the proof is narrow, the claim stays narrow.

08 Reproducible detector benchmarks

Reproducible, not asserted.

Every benchmark is generated from the in-repo detector by scripts/benchmark_pii.py against public or in-repo corpora. The detector v8 scorecard reports ready with all 8 release gates passing; the numbers below remain dated artifacts, not external-audit claims. The full per-type F1, the vendor comparison, and the uplift live on /benchmark — this page describes the corpora and how to regenerate them.

Wojood public Arabic NER

test split · 357 sentences · 17 entity types

Headline-PII micro F1 / precision / recall on a public Arabic named-entity corpus. The scored numbers — and the per-type breakdown — live on /benchmark.

WojoodNER 2024 winners report 0.91-0.92 fine-grained F1; we are within the academic SOTA range on the coarse-grained taxonomy. The FAC result is sample-size noise (only 12 gold spans), and v8 adds FAC labels in the local Arabic NER bundle.

python scripts/eval_wojood.py
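For readers unfamiliar with micro-averaging: true positives, false positives, and false negatives are summed across entity types first, and precision, recall, and F1 are computed once from the pooled counts. The sketch below shows that arithmetic only; it is not scripts/eval_wojood.py, and the counts in the usage lines are invented for illustration, not benchmark results.

```python
def micro_prf(counts: dict[str, tuple[int, int, int]]) -> tuple[float, float, float]:
    """Pool (tp, fp, fn) over all entity types, then compute precision, recall, F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented counts per entity type: (true positives, false positives, false negatives).
    example = {"PERS": (8, 2, 2), "ORG": (2, 0, 2)}
    print(micro_prf(example))
```

Pooling first is why a type with very few gold spans moves the headline number very little, even when its own per-type score is noisy.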

Saudi code-switched corpus

Arabic ↔ English mixed text

Recall under the research-corpora gate on text that switches between Arabic and English mid-sentence. Scored result on /benchmark.

python scripts/benchmark_pii.py

Adversarial attacks v1

800 cases · OCR decay, ZWJ injection, code-embedded, transposition, homoglyph, label attack

must_detect recall and phantom-false-positive rate against deliberate evasion. Scored result on /benchmark.

python scripts/benchmark_pii.py

Arabic literary negative corpus

205 cases · Classical Arabic poetry / scripture / scholarly text

Clean rate — this corpus must NOT trigger structural PII. Regex-only is 0.00 because Wojood is Arabic prose without Saudi structural identifiers. Scored result on /benchmark.

python scripts/benchmark_pii.py

Frozen quality suite

12 internal eval packs · ~1,350 cases

Required-suites pass rate across the frozen regression set. Scored result on /benchmark. Methodology and model evolution are documented in docs/ner-v3-card.md through docs/ner-v7-card.md and docs/detector-v8-release-notes-2026-05-19.md (v3 → v8).

python scripts/benchmark_pii.py

Detector engine p95 — 1K-character English case

p95 47.92 ms · target ≤ 75 ms · PASS (30 iterations)

pii_benchmark_latest.json · dated 2026-04-29 · live · detector engine, not gateway latency
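A p95 over 30 iterations can be computed as sketched below. The nearest-rank percentile definition and the helper names are assumptions; scripts/benchmark_pii.py may use a different interpolation, so small differences from the published figure would not by themselves indicate a problem.

```python
import math
import time
from typing import Callable, Sequence


def percentile_nearest_rank(samples_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(n * pct / 100), 1-indexed."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def time_calls(fn: Callable[[], object], iterations: int = 30) -> list[float]:
    """Run fn repeatedly and return wall-clock durations in milliseconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def passes_target(samples_ms: Sequence[float], target_ms: float = 75.0) -> bool:
    """True when the p95 of the samples is at or under the target."""
    return percentile_nearest_rank(samples_ms, 95) <= target_ms


if __name__ == "__main__":
    # Stand-in workload; a real run would call the detector on a 1K-character input.
    samples = time_calls(lambda: sum(range(10_000)))
    print(percentile_nearest_rank(samples, 95), passes_target(samples))
```

With 30 samples, nearest-rank p95 is the 29th smallest timing, so a single slow outlier does not decide the gate.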

The reproducible vendor comparison is on /benchmark, generated from comparison_to_industry_2026-04-28.json. To verify: clone the repository, run scripts/benchmark_pii.py and scripts/eval_wojood.py, and compare against the artifacts in docs/generated/. The full benchmark history is in docs/generated/pii_benchmark_history/.

09 Test coverage snapshot

Where a number would lie, we point at the snapshot.

Backend, dashboard, and production-build coverage. The verified snapshot is kept as a dated internal evidence note — so rather than freeze a percentage that would drift, we point you at the current dated snapshot.

Backend tests: see current dated snapshot · dated snapshot
Dashboard unit / integration: see current dated snapshot · dated snapshot
Dashboard production build: Passing

The covered surfaces include PII detection, tokenization, vault encryption, pipeline orchestration, admin authorization, webhook delivery, monitor health, deploy / backup / restore scripts, and dashboard UI.

10 How to verify

How to verify.

If you read this page top to bottom, the live values you saw were real fetches — here is the full checklist to run it cold from a clean clone. Every link points to a published artifact or a live product surface — no NDAs, no screenshots.

  1. Request a pilot API key and test detection and routing with your own representative data
  2. Open the public trust report and compare /resources/trust-report.json totals against /resources/control_matrix.json
  3. Review the compliance bundle in the dashboard (copy or download as JSON for internal review), or download branded evaluation PDFs from the resources page
  4. Check the evidence pack sections for integrity, external evidence, and policy snapshot status
  5. Request regulator-portal access when the evaluation requires cross-tenant evidence, SDAIA-shaped report builders, or scoped signed-package generation
  6. Verify scoped signed packages using the published verification details rather than relying on screenshots or forwarded files alone
  7. Ask about any published constraint on the compliance page — questions go to dpo@datasitr.com
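Step 2 above can be automated once both JSON files are downloaded. The sketch below compares two totals mappings and reports any key whose values differ. The key names in the usage lines, the local file names, and the "totals" field mentioned in the comment are hypothetical, since the schema of /resources/trust-report.json and /resources/control_matrix.json is not described on this page.

```python
import json
from pathlib import Path


def load_json(path: str) -> dict:
    """Read a downloaded artifact from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def diff_totals(report_totals: dict, matrix_totals: dict) -> dict:
    """Return {key: (report_value, matrix_value)} for every key that disagrees."""
    keys = set(report_totals) | set(matrix_totals)
    return {
        key: (report_totals.get(key), matrix_totals.get(key))
        for key in sorted(keys)
        if report_totals.get(key) != matrix_totals.get(key)
    }


if __name__ == "__main__":
    # Hypothetical shapes; with real files, something like
    # load_json("trust-report.json")["totals"] would replace these literals.
    report = {"total": 177, "code_test": 148, "dated_evidence": 16, "external_fact": 13, "pending": 0}
    matrix = dict(report)
    print(diff_totals(report, matrix) or "totals match")
```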

See it work on your data.

Every number on this page was fetched live, or dated, or named for what it is. The next step is your own data on the live pilot.