Overview
This document is intended for security, data governance, and architecture review teams evaluating the use of Provalytics through a custom MCP connector in ChatGPT. It covers:
- the system architecture for the integration
- the data architecture and data schema exposed through the MCP server
- the detailed operating model for how ChatGPT accesses Provalytics data
- the controls and limitations built into the implementation
Executive summary
The Provalytics ChatGPT integration does not send a full database dump to ChatGPT. Instead, ChatGPT connects to the Prova MCP server and makes on-demand, read-only tool calls for specific reporting and modeling questions. Each API key is scoped to a single client workspace, and the MCP server returns only the data that the authenticated key is permitted to access. Key design points:
- access is read-only
- access is client-scoped
- tool calls are made on demand, not as a persistent bulk sync
- ChatGPT does not connect directly to Provalytics databases
- the customer tool surface exposes reporting and model outputs, not administrative controls or write operations
1. Detailed description of data used for ChatGPT integration
The ChatGPT integration uses the Provalytics MCP server as a controlled data-access layer. When a user asks a question in ChatGPT, ChatGPT calls one or more Provalytics MCP tools, such as:
- get_incrementality
- get_recommendations
- get_campaign_performance
- get_model_predictions
- get_model_statistics
- get_days_to_conversion
- get_cpm
- get_categories
- get_methodology
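For reviewers unfamiliar with MCP, a tool call is a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds one such request body for get_incrementality. The argument names (kpi, timeframe) are assumptions borrowed from the returned-field names listed later in this document, not a confirmed input schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: the real input schema of get_incrementality is not
# documented here, so "kpi" and "timeframe" are illustrative only.
payload = build_tool_call(
    "get_incrementality", {"kpi": "revenue", "timeframe": "last_90_days"}
)
print(payload)
```

Each request names exactly one tool and carries only that tool's parameters, which is what makes the access pattern on-demand rather than a bulk transfer.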
What data is typically returned
The current customer-facing MCP tool set is designed around aggregated reporting and model outputs, including:
- channel-level spend
- channel-level incrementality
- recommendation and forecast outputs
- campaign or hierarchy-level performance rollups
- model validation metrics such as R² and MAPE
- model-predicted vs actual time series
- CPM and impression-share analysis
- days-to-conversion metrics
- category and subcategory mappings
- static methodology text
What is not sent by design
The MCP integration is not designed to expose:
- write access into Provalytics
- raw database access
- unrestricted SQL access
- administrative UI actions
- source-system credentials
- Provalytics user passwords
- connector secrets
- cross-client data for a client-scoped key
PII and user-level data note
The exposed customer tool set is oriented around aggregated reporting and modeled outputs, not user-level identity data. Typical responses contain channel, campaign, KPI, date, spend, impression, click, forecast, and model-quality fields rather than cookies, device IDs, hashed emails, phone numbers, or person-level event streams. That said, campaign names, category labels, and other customer-defined taxonomy values are client-authored strings. As a best practice, customers should avoid embedding personal or otherwise sensitive information in naming conventions.
2. Data schema and data architecture
Access pattern
The MCP server exposes a controlled tool interface over Provalytics data. For a customer-scoped key:
- ChatGPT calls the MCP endpoint
- The Prova MCP server validates the API key
- The MCP server resolves the client scope for that key
- The MCP server calls a specific read-only tool
- The tool reads from the approved reporting table, restored model bundle, or static methodology content
- The result is returned to ChatGPT
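The steps above can be sketched as a single request handler. This is a minimal illustration under stated assumptions, not the Prova implementation: the in-memory KEYS and REPORTING dictionaries, the KeyRecord class, and the handle_call function are all hypothetical stand-ins.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    client_id: str
    scope: str


# Hypothetical in-memory stand-ins for the key table and a reporting table.
KEYS = {
    hashlib.sha256(b"prova_demo").hexdigest(): KeyRecord("k1", "client_a", "client"),
}
REPORTING = {
    "client_a": [{"channel": "search", "spend": 100.0}],
    "client_b": [{"channel": "social", "spend": 50.0}],
}


def get_campaign_performance(client_id: str) -> list[dict]:
    """Read-only tool: returns copies of the rows for one client only."""
    return [dict(row) for row in REPORTING.get(client_id, [])]


TOOLS: dict[str, Callable[[str], list[dict]]] = {
    "get_campaign_performance": get_campaign_performance,
}


def handle_call(api_key: str, tool_name: str) -> list[dict]:
    # Validate the API key by its hash.
    record = KEYS.get(hashlib.sha256(api_key.encode()).hexdigest())
    if record is None:
        raise PermissionError("unknown API key")
    # Select a specific read-only tool.
    tool = TOOLS.get(tool_name)
    if tool is None:
        raise LookupError(f"unknown tool: {tool_name}")
    # The client scope comes from the key record, never from the caller.
    return tool(record.client_id)
```

The point of the sketch is that the caller never supplies a client identifier: the scope is resolved from the key, so a request cannot name another client's workspace.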
Tool surface and source architecture
| MCP tool | Data domain | Primary source | Example returned fields |
|---|---|---|---|
| get_categories | Category mapping / funnel organization | Provalytics database | category, subcategory, mapped channel/campaign combinations |
| get_incrementality | Channel contribution above baseline | Provalytics database | kpi, kpis_available, timeframe, channel, incremental_units, share_pct, spend |
| get_recommendations | Optimizer recommendation output | Provalytics database | scenario, forecast_period, total_current_spend, total_recommended_spend, channel, current_spend, recommended_spend, change, change_pct, inc_share_pct, roas |
| get_marginal_response | Response curves / efficiency comparison | Provalytics database | channel, current_spend, current_response, roas_per_dollar, curve_points |
| get_model_statistics | Model quality summary | Provalytics database | model, observations, r_squared, rmse, mape_pct, optional coefficients |
| get_model_predictions | Predicted vs actual time series | Provalytics database | model, dates, actual, predicted, mape_pct |
| get_campaign_performance | Channel / campaign hierarchy performance | Provalytics database | date_range, level, channel, spend, incremental_units, impressions, clicks, roas, ctr_pct, cpm |
| get_days_to_conversion | Conversion timing by channel | Provalytics database | kpi, as_of_date, timeframe, channel, days_to_conversion, impressions |
| get_cpm | CPM and impression-share reporting | Provalytics database | date_range, blended_cpm, total_impressions, total_spend, daily_trend, channel, impression_share_pct, yoy |
| get_methodology | Static model explanation | In-code methodology content | methodology sections, summaries, definitions, academic references |
Source-of-truth behavior
The implementation intentionally aligns several MCP tools with the same persistent reporting tables used by the Provalytics dashboard. Examples:
- get_incrementality reads from the Provalytics incrementality reporting layer
- get_campaign_performance reads from the Provalytics campaign-performance reporting layer
- get_days_to_conversion reads from the Provalytics days-to-conversion reporting layer
- get_cpm reads from the Provalytics cost and impressions reporting layer
Internal-only separation
The MCP server contains an internal-only tool called list_clients, but that tool is restricted to internal-scoped keys and is not available to customer-scoped keys.
For customer ChatGPT integrations, the relevant operating assumption is:
- one user key
- one client scope
- no cross-client listing or traversal
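One way to picture this separation is a per-scope allowlist that fails closed. The sketch below is illustrative only: the scope labels "client" and "internal", and the assumption that internal keys may also call the customer tools, are not confirmed by this document.

```python
CUSTOMER_TOOLS = frozenset({
    "get_categories", "get_incrementality", "get_recommendations",
    "get_marginal_response", "get_model_statistics", "get_model_predictions",
    "get_campaign_performance", "get_days_to_conversion", "get_cpm",
    "get_methodology",
})
INTERNAL_ONLY_TOOLS = frozenset({"list_clients"})


def allowed_tools(scope: str) -> frozenset:
    """Return the tool names a key with this scope may call.

    The scope labels "client" and "internal" are assumed for illustration.
    """
    if scope == "internal":
        # Assumption: internal keys see the customer tools plus list_clients.
        return CUSTOMER_TOOLS | INTERNAL_ONLY_TOOLS
    if scope == "client":
        return CUSTOMER_TOOLS
    return frozenset()  # unknown scopes get nothing (fail closed)


def authorize(scope: str, tool_name: str) -> None:
    """Raise PermissionError unless the scope may call the tool."""
    if tool_name not in allowed_tools(scope):
        raise PermissionError(f"{tool_name} is not available to {scope}-scoped keys")
```

With this shape, a customer-scoped key asking for list_clients is rejected before any data source is touched.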
Data minimization characteristics
The data returned to ChatGPT is limited by:
- the selected tool
- the parameters supplied to that tool
- the client scope attached to the API key
- report visibility configuration where applicable
3. Systems architecture diagram
Trust boundaries
ChatGPT boundary
ChatGPT acts as the client application invoking MCP tools. It does not connect directly to Provalytics databases.
MCP server boundary
The Prova MCP server is the enforcement layer for:
- authentication
- client scoping
- tool selection
- rate limiting
- response shaping
Data-source boundary
Approved reporting tables, model bundles, and static methodology content remain inside the Provalytics environment. ChatGPT receives only the response payload produced by the selected tool call.
Authentication and authorization controls
API key model
The MCP server accepts credentials via:
- Authorization: Bearer <token> header
- ?token=<token> query parameter
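A server accepting both credential forms has to pull the token out of either place. The following is a minimal sketch of that extraction; the function name extract_token, the example host, and the choice to prefer the header over the query parameter are assumptions rather than documented behavior.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def extract_token(url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the presented API key, or None if neither form is present."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, credential = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credential.strip():
        return credential.strip()
    # Fall back to the ?token=<token> query parameter.
    values = parse_qs(urlsplit(url).query).get("token", [])
    return values[0] if values else None


# Hypothetical endpoint URL and key values, for illustration only.
print(extract_token("https://mcp.example.com/mcp", {"Authorization": "Bearer prova_abc"}))
print(extract_token("https://mcp.example.com/mcp?token=prova_xyz", {}))
```

Where the client supports it, the header form is generally the safer choice, since URLs with query parameters are more likely to be captured in logs and browser history.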
Key format and storage
The implementation validates that keys use the prova_ prefix and performs a SHA-256 hash lookup against the stored key record.
This means the server checks a hashed representation of the key rather than retrieving a stored plaintext key value from the database.
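To make the hashed-lookup idea concrete, here is a small sketch of the pattern. The in-memory key_table and the issue_key helper are hypothetical; this document does not describe how keys are generated or where the key table lives.

```python
import hashlib
import secrets
from typing import Optional

KEY_PREFIX = "prova_"

# Hypothetical key table: maps SHA-256 hex digest -> client_id.
key_table: dict = {}


def hash_key(api_key: str) -> str:
    """Return the SHA-256 hex digest used as the lookup value."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def issue_key(client_id: str) -> str:
    """Create a key, store only its hash, and return the plaintext once."""
    api_key = KEY_PREFIX + secrets.token_urlsafe(32)
    key_table[hash_key(api_key)] = client_id
    return api_key


def lookup_client(api_key: str) -> Optional[str]:
    """Return the client_id bound to the key, or None if it is not valid."""
    if not api_key.startswith(KEY_PREFIX):
        return None  # cheap format check before any table lookup
    return key_table.get(hash_key(api_key))
```

Because only the digest is stored, reading the key table does not reveal a usable key: the plaintext exists only at issuance and in the holder's own configuration.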
Client scoping
The key record includes:
- key_id
- client_id
- scope
- name
Of these fields, client_id is the one that binds the key to a single client workspace.
Revocation
Revoked keys are rejected by the MCP server and stop working immediately once marked revoked in the key table.
Rate limiting
The server enforces a per-key rate limit of 60 requests per minute.
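A per-key limit of this kind can be pictured as a rolling-window counter. The sketch below is one common way to implement it, assuming a sliding 60-second window; this document does not state whether the server uses a sliding window, a fixed window, or a token bucket, and the class name is hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class PerKeyRateLimiter:
    """Allow at most `limit` requests per key in any rolling `window` seconds."""

    def __init__(self, limit: int = 60, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # key_id -> timestamps of recent requests

    def allow(self, key_id: str) -> bool:
        now = self.clock()
        hits = self._hits[key_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: reject without recording the attempt
        hits.append(now)
        return True
```

Counters are kept per key, so one busy key exhausting its budget does not affect requests made with a different key.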
Monitoring
The implementation tracks:
- request count
- last used timestamp
- per-tool request volume
- session count
- session duration
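The tracked metrics are all usage counters rather than payload contents. A hypothetical per-key record covering the same five items might look like the sketch below; the class and field names are illustrative and are not taken from the Prova implementation.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyUsage:
    """Aggregate usage counters for one API key (no request or response bodies)."""
    request_count: int = 0
    last_used: Optional[float] = None
    per_tool: Counter = field(default_factory=Counter)
    session_count: int = 0
    session_seconds: float = 0.0

    def record_request(self, tool_name: str, at: float) -> None:
        """Count one tool call made at timestamp `at`."""
        self.request_count += 1
        self.last_used = at
        self.per_tool[tool_name] += 1

    def record_session(self, started: float, ended: float) -> None:
        """Count one finished session and add its duration."""
        self.session_count += 1
        self.session_seconds += ended - started
```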
Transport and interaction model
In transit
The MCP endpoint is served over HTTPS.
The live endpoint supports TLS 1.2 and TLS 1.3. TLS 1.0 and TLS 1.1 are not accepted by the endpoint.
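Reviewers who want their own test clients to hold the same floor can pin the minimum protocol version on the client side. This is a small sketch using Python's standard ssl module; the function name is hypothetical, and it configures a client only, not the Prova endpoint.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.minimum_version.name)
```

The resulting context can be passed to standard-library clients, for example as the context argument of http.client.HTTPSConnection, so that a connection fails rather than silently negotiating an older protocol.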
Interaction style
The MCP server supports both:
- stateless request/response message handling
- streaming / SSE transport for compatible clients
Data handling characteristics relevant to approval
Read-only design
The customer-exposed MCP tool set is read-only. No tool writes back into Provalytics data stores.
No direct database access from ChatGPT
ChatGPT does not authenticate directly against Provalytics internal data stores.
No customer-scoped admin surface
Customer keys do not expose internal-only client enumeration or administrative controls.
On-demand responses rather than bulk sync
Data is returned only when a tool is invoked. The integration is not built as a background replication process into ChatGPT.
Example approval language
If the customer security team wants a concise summary, the following language is accurate:
The Provalytics ChatGPT integration uses a client-scoped, read-only MCP server. ChatGPT does not connect directly to Provalytics databases and does not receive a full data export. Instead, ChatGPT makes authenticated, on-demand tool calls to the Prova MCP server, which validates the user key, enforces client scoping, queries approved reporting or model-output sources, and returns only the requested result set. Customer-scoped keys cannot modify data and cannot traverse other clients’ workspaces.
Questions security teams commonly ask
Does ChatGPT receive a full copy of our Provalytics data?
No. The integration is request-driven. ChatGPT receives only the response payload for the specific MCP tool call that was made.
Can the connector write back into Provalytics?
No. The current customer tool set is read-only.
Can one customer key access another client’s data?
No. Customer keys are client-scoped.
Is raw source-system credential material exposed?
No. The customer MCP tool surface is designed around reporting and model outputs, not connector secret retrieval.
Are model methodology explanations also available?
Yes. get_methodology returns static methodology content and does not require client-specific reporting data.
Recommended next step
For formal security review, this document should be paired with:
- the ChatGPT connector setup guide
- the MCP overview page
- any customer-specific internal policy language around AI usage and retention