A client wanted their billing system to push invoice updates into Dataverse. Their invoice numbers were the canonical identifier on their side - six digits, globally unique, never reused. Our instinct was correct: use the invoice number as an alternate key so the integration can upsert by it, skipping the round-trip lookup to find the Dataverse GUID for each invoice.
The implementation took an afternoon. The debugging of the three edge cases it introduced took two weeks, spread across three separate incidents. Here is what alternate keys do, when they are genuinely the right answer, and the traps we now look for before enabling one.
What alternate keys actually do
Every Dataverse row has a primary key - a system-generated GUID. The GUID is stable, unique, and meaningless to anything outside Dataverse. For integrations with external systems that have their own identifier ("our CRM customer number is ACC-12345"), you have two options:
- Look up the Dataverse GUID before every write: GET /accounts?$filter=acme_customernumber eq 'ACC-12345' then PATCH /accounts({guid}).
- Define acme_customernumber as an alternate key and write directly: PATCH /accounts(acme_customernumber='ACC-12345').
Option 2 halves the API calls and is atomic - no window between the read and the write where another process might create a duplicate. For high-volume integrations this is a meaningful performance and correctness improvement.
Dataverse enforces uniqueness on alternate key columns at the platform level, backed by a unique index. Two rows with the same alternate key value is impossible - the second insert fails.
When alternate keys are the right answer
The pattern we reach for alternate keys on:
- External system is the system of record for the identifier. The CRM pushes account numbers to Dataverse, never the reverse.
- Identifier is stable. The external system does not reuse account numbers after an account is deleted, and does not renumber on migrations.
- Integration is high volume. A nightly batch of 10,000 upserts benefits from alternate key syntax; a weekly five-record sync does not.
- You want platform-enforced uniqueness as a data quality guarantee, not just a "should be unique" business rule.
When all four are true, alternate keys are a straightforward win.
Gotcha #1: null values in the alternate key column
You enable acme_customernumber as an alternate key on the account table. Dataverse accepts the change. A week later, the integration fails with a uniqueness violation.
Root cause: there were four existing account rows with acme_customernumber null (data from before the external system was fully wired up). Dataverse treats null values in a unique index as distinct - so null is allowed to appear multiple times. But if a new row is inserted with null for that column, any future upsert that uses null as the key hits ambiguity.
More subtly: the alternate key creation succeeds on tables that contain null values in the key column. Dataverse does not block you. It only fails when the integration behavior exposes the problem.
Our fix is a two-step policy:
- Before enabling an alternate key, run a pre-check query to find null values in that column.
- Either backfill them with a unique sentinel value, or make the column required (requiredLevel: systemrequired) and default new rows to a placeholder the integration later overwrites.
Gotcha #2: case sensitivity
The external CRM uses lowercase account numbers in its database. The UI shows them uppercase. Users who hand-enter data in Dataverse type them uppercase. The integration, which reads the database directly, writes them lowercase.
An account arrives from the integration as acc-12345. A user later creates an account manually with customer number ACC-12345, assuming they are the same record. Dataverse - because the alternate key uniqueness index is case-sensitive in some collation settings but case-insensitive in others, and the default depends on your environment's regional settings - accepts both. You now have two records for the same real-world account.
The fix we enforce: normalize the alternate key value in every write path. The integration lowercases (or uppercases, pick one) before writing. A plugin on pre-create of the parent table mirrors the same normalization for manually created rows. Tests verify both paths produce the same stored value.
The broader lesson: the platform enforces the unique index as it sees it, which may not match how humans see it.
Gotcha #3: alternate key on a lookup column
A client wanted to make acme_primary_contact an alternate key on the account table. The logic: each account has exactly one primary contact, and they wanted to enforce uniqueness.
Dataverse lets you define alternate keys on lookup columns, but the stored value in the index is the target's primary key, not any business identifier. Uniqueness is enforced on "no two accounts can reference the same contact row as primary", which was actually the intent.
But consider the integration implications: you cannot upsert by "the primary contact's email address" because the alternate key stores contact GUIDs, not email. And because lookup alternate keys are enforced at the GUID level, they become brittle when the target row gets deleted and recreated (new GUID, same business meaning - uniqueness check now passes for a row that should have failed).
Our pattern is: alternate keys on lookup columns are fine for enforcing uniqueness of the relationship, never for enabling upsert by an external identifier of the target. If you need the latter, denormalize the target's business identifier onto the parent table and put the alternate key on that string column.
When alternate keys are the wrong answer
Three cases we have walked teams back from:
- "We might want to upsert by this eventually." Adding an alternate key for a future maybe creates a unique index you have to maintain. Remove the temptation until you have a concrete integration that needs it.
- Composite alternate keys with optional parts. Dataverse supports up to three columns per alternate key. If any of those columns can be null, you inherit gotcha #1 at compound level - much harder to reason about.
- Columns that users edit freely. An alternate key prevents duplicate values. If a user needs to temporarily hold two rows with the same value while they reconcile them, the platform blocks them. Operationally painful.
The review we run before enabling one
Every alternate key request now runs a three-item check before we turn it on:
- What is the integration use case? (Must be concrete, must be a write pattern, not a read.)
- Is the column required and null-free on all existing rows?
- Is there a normalization rule, and where is it enforced? (Every write path, not just one.)
If any item is weak, we park the alternate key until the gaps are closed. This has not once blocked a legitimate integration; it has three times prevented a production incident.