Security
Most AI tools filter sensitive content after the AI has already read it. That's not security. That's hope.
InternalWiki enforces permissions at the database query level. Unauthorised documents never enter the AI's context. Not filtered after generation — prevented before retrieval. The AI cannot leak what it never saw.
Two approaches to AI permissions. Only one is deterministic.
What most AI tools do
Probabilistic. The AI has already seen sensitive documents.
What InternalWiki does
Deterministic. The AI never sees unauthorised content.
This isn't a feature. It's an architecture decision that cannot be retrofitted.
This is the query that runs on every question
Every question to InternalWiki executes this permission check. No exceptions. No cache bypass. No admin override.
```sql
SELECT chunks.content,
       chunks.metadata,
       sources.title,
       sources.freshness_status
FROM chunks
JOIN sources
  ON chunks.source_id = sources.id
JOIN permissions
  ON sources.id = permissions.source_id
WHERE permissions.user_id = $current_user
  AND permissions.access_level >= 'read'
  AND similarity(
        chunks.embedding,
        $query_embedding
      ) > 0.3
ORDER BY similarity DESC
LIMIT 10;
```
Security at every layer
Permissions are the headline. Here's everything else.
Encryption at rest
All data encrypted with AES-256 at the storage layer. Document chunks, vector embeddings, metadata, and permission records are stored in Neon PostgreSQL with encryption enabled by default. Even with direct database access, data is unreadable without the encryption keys — which are managed separately and rotated automatically.
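The storage-layer encryption described above can be sketched with Node's built-in crypto module. This is an illustrative example of AES-256-GCM, not InternalWiki's actual key-management code; the record shape and function names are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical record shape for an encrypted chunk: nonce, auth tag, ciphertext.
type EncryptedRecord = { iv: Buffer; tag: Buffer; data: Buffer };

// Encrypt one chunk with AES-256-GCM. A fresh 12-byte nonce per record means
// identical plaintexts never produce identical ciphertexts.
function encryptChunk(plaintext: string, key: Buffer): EncryptedRecord {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Decrypt and verify. GCM's auth tag makes tampering detectable: a modified
// ciphertext fails decipher.final() instead of silently returning garbage.
function decryptChunk(record: EncryptedRecord, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, record.iv);
  decipher.setAuthTag(record.tag);
  return Buffer.concat([decipher.update(record.data), decipher.final()]).toString("utf8");
}
```

Without the 32-byte key, the stored record is unreadable, which is the property the paragraph above relies on.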
Encryption in transit
Every connection uses TLS 1.3. Between your browser and our servers. Between our servers and OpenAI's API. Between our servers and your data sources (Google Drive, Slack, SharePoint). No unencrypted data moves at any point in the pipeline. Certificate pinning enforced on all external API calls.
No training on your data
Your documents are never used to train AI models. We use OpenAI's API with data processing agreements that explicitly prohibit training on customer data. This is contractual, not just a policy — it's a legal obligation. When we introduce additional model providers, the same DPA requirement applies.
Token encryption
OAuth access tokens for your connected sources are encrypted at rest using a dedicated TOKEN_ENCRYPTION_KEY, separate from the database encryption. Tokens are refreshed automatically and can be revoked instantly by disconnecting a source.
What happens to your data at every stage
From the moment a document enters InternalWiki to the moment you disconnect — every stage is designed to minimise exposure and maximise your control.
Ingestion
Documents fetched via OAuth from your connected sources. Raw content is chunked and embedded as vectors in memory. The raw file is never stored — only chunked passages and their vectors are persisted.
↑ 1,247 documents processed
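The chunking step above can be sketched roughly like this. The window size and overlap are illustrative defaults, not InternalWiki's actual parameters, and the embedding call is omitted.

```typescript
// Illustrative sketch of ingestion chunking: slide a fixed-size window over
// the raw text with some overlap so passages keep context across boundaries.
// Only these chunks (plus their embeddings) would be persisted; the raw file
// is discarded after processing. Parameters are assumptions for illustration.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```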
Storage
Encrypted at rest with AES-256. Chunked passages and vector embeddings stored in PostgreSQL with pgvector. Permission records synced from source systems. Automatic backups and point-in-time recovery.
AES-256 · pgvector · 99.9% uptime
Retrieval
Permission check runs on every query. No cached answers bypass the permission join. If a document's permissions changed 30 seconds ago, the next query reflects that change.
Avg query: 0.3s · Permission checks: 100%
Deletion
Disconnect a source and all data is permanently deleted within 24 hours: chunks, embeddings, metadata, permission records, sync state. No residual data. No tombstoned records.
24hr SLA · Zero residual
Every question. Every source. Every permission check. Logged.
The audit trail doesn't just log what was returned. It logs what was EXCLUDED and why.
Every entry is searchable by user, date range, source document, confidence level, and permission outcome. Filter to see every query that accessed a specific document. Filter to see every permission denial for a specific user. Export as JSON or CSV for your compliance team or external auditors.
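A minimal sketch of the kind of filtering described above. The entry shape and field names are assumptions for illustration, not the real audit schema.

```typescript
// Hypothetical audit entry shape, mirroring the filters described above.
type AuditEntry = {
  user: string;
  timestamp: string;                        // ISO 8601
  sourceDocument: string;
  confidence: number;                       // evidence score, 0..1
  permissionOutcome: "granted" | "denied";  // was the source allowed into context?
};

// Return entries matching every given criterion, e.g. all denials for a user,
// or every query that touched a specific document.
function filterAudit(log: AuditEntry[], criteria: Partial<AuditEntry>): AuditEntry[] {
  return log.filter(entry =>
    Object.entries(criteria).every(
      ([key, value]) => (entry as Record<string, unknown>)[key] === value
    )
  );
}
```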
Logs retained for 90 days on Team plans. 1 year on Enterprise. Custom retention available.
Compliance status
Actively working toward full compliance. Here's where we stand.
GDPR
Ready: Data processing agreements available for all customers. Right to deletion fully supported and automated. Data subject access requests supported through the admin panel.
Data Residency
US-East: All data stored in US-East (AWS us-east-1 via Neon). Database, embeddings, metadata, and permission records all reside in the same region.
SOC 2 Type II
In progress: Independent third-party audit of security controls, availability, processing integrity, and confidentiality. Covers the full InternalWiki infrastructure.
ISO 27001
Planned: International information security standard. Evaluation begins after SOC 2 completion.
Responsible AI means showing your work
We don't use the phrase ‘responsible AI’ as a marketing checkbox. For InternalWiki, it's an architecture principle: the AI should be accountable to the human using it. Every design decision — citations, confidence scores, freshness indicators, permission enforcement — exists to keep the human in control of what they trust.
No hallucinated claims
Every fact in every answer is anchored to a specific passage in a specific document. If the AI can't find a source for a claim, the claim doesn't appear in the answer. This is enforced at the generation level, not the output level.
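As a rough sketch of the invariant described above, consider a check in which a claim survives only if it cites a chunk that was actually retrieved. Per the text, the real system enforces this during generation; this post-hoc filter only demonstrates the property, and the types and names are illustrative.

```typescript
// Illustrative invariant: every claim must cite a retrieved chunk. A claim
// with no citation, or a citation to a chunk that was never retrieved, is
// dropped. (The actual enforcement happens at generation time.)
type Claim = { text: string; citedChunkId: string | null };

function groundedClaims(claims: Claim[], retrievedChunkIds: Set<string>): Claim[] {
  return claims.filter(
    c => c.citedChunkId !== null && retrievedChunkIds.has(c.citedChunkId)
  );
}
```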
Confidence scoring that means something
The confidence score isn't how sure the AI is. It's how much evidence exists. 94% means multiple current sources agree. 31% means weak evidence — and the Trust Panel tells you to check with a human instead of acting on the answer.
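One way to read "how much evidence exists" is as a score computed over retrieved passages rather than model probability. The weighting below, where stale sources count half and roughly three strong current sources saturate the score, is an invented illustration, not InternalWiki's formula.

```typescript
// Illustrative evidence-based confidence: each retrieved passage contributes
// its similarity, stale passages count half, and the score saturates once
// several strong current sources agree. All weights are assumptions.
type Evidence = { similarity: number; isCurrent: boolean };

function confidenceScore(evidence: Evidence[]): number {
  if (evidence.length === 0) return 0; // no evidence, no confidence
  const support = evidence
    .map(e => e.similarity * (e.isCurrent ? 1 : 0.5))
    .reduce((a, b) => a + b, 0);
  return Math.min(1, support / 3);
}
```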
Human routing when the AI isn't enough
When confidence drops below threshold, the Trust Panel suggests who to talk to — the document author, the subject-matter expert, the person most likely to know. The AI steps back and points you to a person. That's responsible AI: knowing when to defer.
Security questions we hear most
Do you store our original documents?
No. When a document is ingested, it's split into smaller passages (chunks), each converted into a vector embedding. The raw document content is processed in memory and discarded. Only the chunked passages and their embeddings are stored. InternalWiki cannot reconstruct your original files from what it stores.
What could an attacker read if your database were breached?
All stored data is encrypted at rest with AES-256. Document chunks without the encryption key are unreadable. OAuth tokens are encrypted with a separate key. Permission records would reveal that a user has access to a source — but not the content of that source. Vector embeddings alone cannot be reverse-engineered into readable text.
Can InternalWiki employees read our documents?
Production database access is restricted to a minimal set of infrastructure roles with audit logging. All stored content is encrypted at rest. Employee access to production systems requires multi-factor authentication and is logged. We do not have a mechanism to query your documents ad hoc — the permission enforcement applies to internal access as well.
How quickly do permission changes take effect?
Permission changes in Google Drive, Slack, and SharePoint are detected during sync cycles. Most permission changes are reflected within minutes. If a document's sharing is revoked in Google Drive at 9:00 AM, by 9:05 AM queries from unauthorised users will no longer retrieve that document's content.
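The sync cycle described in this answer amounts to diffing the permissions reported by the source system against what is stored. A simplified sketch, treating permissions as sets of user IDs:

```typescript
// Simplified permission sync: compare the user set reported by the source
// (e.g. Google Drive sharing) with the stored set, and compute what to grant
// and what to revoke. Real permission records also carry access levels.
function diffPermissions(
  stored: Set<string>,
  fromSource: Set<string>
): { grant: string[]; revoke: string[] } {
  const grant = [...fromSource].filter(u => !stored.has(u));  // newly shared
  const revoke = [...stored].filter(u => !fromSource.has(u)); // sharing removed
  return { grant, revoke };
}
```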
Can we use our own AI models?
On Enterprise plans, yes. We support custom model configuration for organisations with specific AI governance requirements. The permission enforcement, citation system, and Trust Panel work identically regardless of which model generates the answer.
Do you offer a data processing agreement (DPA)?
Yes. Available for all paid plans. Contact hello@internalwiki.com and we'll send the DPA for review and signature. Our DPA covers data storage, processing, retention, deletion, sub-processors, and breach notification procedures.
Which sub-processors do you use?
OpenAI (answer generation — data not used for training), Neon (PostgreSQL database hosting), Vercel (application hosting and CDN), Clerk (authentication). A full sub-processor list with data handling details is available in our DPA.
Have security questions? We'd rather answer them than have you wonder.
We're happy to walk through our architecture in detail. Bring your security team — we've done this before.