A Dataverse row changes. Five downstream systems need to know: an ERP that tracks financials, a search service that indexes the record, a Power BI dataset that feeds executive dashboards, a notification queue that messages field reps on mobile, and a data lake for analytics retention.
The junior version of this is a single async plugin that calls all five from the same post-operation step. It works in development. In production, it fails the first time any one of the five has a bad afternoon - the plugin step errors, retries ten times, fills the System Jobs queue, and operators get paged.
The pattern we ship at enterprise scale is different: the plugin's only job is to publish a message to an Azure Service Bus topic. Five independent consumers subscribe to that topic and handle their respective destinations. Each consumer retries, dead-letters, and is monitored independently. The plugin never calls out to anything but Service Bus.
Here is the architecture, the code, the failure semantics, and the six months of real-production experience that shaped it.
The architecture
A Topic (not a Queue) because the same event feeds multiple subscribers. Subscriptions filter so each consumer only sees events it cares about. Dead-letter per subscription, not global - the ERP and search services can fail independently without cross-contaminating.
Stage 1: the plugin
The plugin is small on purpose. Every line is a potential failure point; the less it does, the more reliable it is.
Three key decisions:
- Payload is metadata, not the full row. The message includes the entity name, ID, and which attributes changed. Consumers fetch the full row if they need it. Messages stay small (sub-1KB), which Service Bus charges less for and which avoids the edge case of payloads exceeding message size limits.
- CorrelationId from the plugin context. Every log entry and every downstream message carries the same correlation ID, making cross-system tracing possible with one query.
- Subject field for filtering. Subscribers filter on Subject LIKE 'account.%' or Subject = 'order.Update'. Subject-based filtering is cheap server-side; body-based filtering is more expensive.
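The three decisions above can be sketched together. This is an illustrative Python model of the message the plugin publishes (the real plugin is C# running in the Dataverse sandbox, and the field names here are assumptions, not the exact production schema):

```python
import json
import uuid

def build_change_message(entity, message_name, record_id,
                         changed_attributes, correlation_id):
    """Build the metadata-only event envelope: entity, ID, and changed
    attribute names -- never the full row. Illustrative sketch only."""
    body = {
        "entity": entity,                  # e.g. "account"
        "id": record_id,                   # Dataverse row GUID
        "changed": changed_attributes,     # attribute names, not values
        "correlationId": correlation_id,   # from the plugin execution context
    }
    return {
        "message_id": str(uuid.uuid4()),           # used by duplicate detection
        "subject": f"{entity}.{message_name}",     # e.g. "account.Update"
        "correlation_id": correlation_id,
        "body": json.dumps(body),
    }

msg = build_change_message("account", "Update",
                           "00000000-0000-0000-0000-000000000001",
                           ["name", "revenue"], "corr-123")
assert msg["subject"] == "account.Update"
assert len(msg["body"].encode()) < 1024   # metadata-only payloads stay under 1 KB
```

The Subject is derived mechanically from entity and message name, which is what makes the cheap server-side `LIKE` filtering in the subscriptions possible.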
Stage 2: Service Bus Topic configuration
Topic-level settings:
- Max message size: 1MB (default). Our payloads are ~500 bytes, nowhere near the limit.
- Message time-to-live: 7 days. Long enough to survive a weekend outage, short enough that stale messages don't haunt the system.
- Duplicate detection: enabled with 10-minute window. If the same MessageId arrives twice within 10 minutes, the second is dropped. This guards against plugin retries causing duplicate downstream processing.
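Conceptually, duplicate detection behaves like the following simplified in-memory sketch. The real mechanism lives inside Service Bus and keys on MessageId; this only illustrates the window semantics:

```python
import time

class DuplicateDetector:
    """Simplified model of Service Bus duplicate detection: a second
    message with the same MessageId inside the window is dropped."""

    def __init__(self, window_seconds=600.0):
        self.window = window_seconds
        self._seen = {}   # message_id -> first-seen timestamp

    def accept(self, message_id, now=None):
        now = time.time() if now is None else now
        # Forget entries older than the detection window.
        self._seen = {m: t for m, t in self._seen.items()
                      if now - t < self.window}
        if message_id in self._seen:
            return False            # duplicate within the window: dropped
        self._seen[message_id] = now
        return True

d = DuplicateDetector(window_seconds=600)
assert d.accept("msg-1", now=0)        # first delivery accepted
assert not d.accept("msg-1", now=120)  # plugin retry 2 minutes later: dropped
assert d.accept("msg-1", now=700)      # outside the window: treated as new
```

The third assertion is exactly why the window length matters: a retry that lands outside the window is processed again, which is what the later tuning note about extending 1 minute to 10 minutes addresses.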
Subscription-level settings per subscriber:
- Filter rule: SQL-like expression on message properties (Subject, custom headers).
- Max delivery count: 5 (same reasoning as in the simpler Service Bus pattern).
- Lock duration: tuned to each consumer's processing time. The ERP consumer, which makes a remote call taking up to 30 seconds, has a 60-second lock. The search consumer, which is purely in-Azure and takes 1-2 seconds, has a 30-second lock.
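As a hedged sketch, the per-subscription settings above map to infrastructure-as-code along these lines (Bicep; the resource names, filter expression, and parent topic symbol `eventsTopic` are illustrative assumptions, not the production template):

```bicep
resource erpSub 'Microsoft.ServiceBus/namespaces/topics/subscriptions@2021-11-01' = {
  parent: eventsTopic        // existing topic resource (assumed)
  name: 'erp'
  properties: {
    maxDeliveryCount: 5      // dead-letter after 5 failed deliveries
    lockDuration: 'PT1M'     // ERP call can take up to 30s; 60-second lock
  }
}

resource erpFilter 'Microsoft.ServiceBus/namespaces/topics/subscriptions/rules@2021-11-01' = {
  parent: erpSub
  name: 'erp-events'
  properties: {
    filterType: 'SqlFilter'
    sqlFilter: {
      sqlExpression: 'Subject LIKE \'account.%\' OR Subject LIKE \'order.%\''
    }
  }
}
```

The search subscription would be identical apart from its filter and a shorter `PT30S` lock.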
Stage 3: the consumers
Each subscriber is an Azure Function with a Service Bus Topic subscription trigger. They share a common frame but implement different downstream logic.
Common frame:
The idempotency store is either Cosmos DB with TTL or Redis. Every successfully processed MessageId is recorded. The TTL matches the message time-to-live on the Topic plus a safety margin, so the store never forgets a message that could still be redelivered.
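The frame every consumer shares can be sketched like this (Python for brevity; an in-memory dict stands in for Cosmos DB or Redis, and `handle`/`process` are illustrative names, not the production code):

```python
import time

class IdempotencyStore:
    """Stand-in for Cosmos DB (with TTL) or Redis: records processed MessageIds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._processed = {}   # message_id -> processed-at timestamp

    def already_processed(self, message_id, now=None):
        now = time.time() if now is None else now
        t = self._processed.get(message_id)
        return t is not None and now - t < self.ttl

    def mark_processed(self, message_id, now=None):
        self._processed[message_id] = time.time() if now is None else now

def handle(message, store, process):
    """Common consumer frame: skip duplicates, run the consumer-specific
    core logic, then record success. `process` may raise, in which case
    the message is abandoned and redelivered by Service Bus."""
    if store.already_processed(message["message_id"]):
        return "skipped"       # redelivery of an already-handled message
    process(message)           # ERP call, index update, refresh trigger, ...
    store.mark_processed(message["message_id"])
    return "processed"

store = IdempotencyStore(ttl_seconds=8 * 24 * 3600)  # topic TTL (7 days) + margin
calls = []
msg = {"message_id": "m-1", "body": "{}"}
assert handle(msg, store, calls.append) == "processed"
assert handle(msg, store, calls.append) == "skipped"  # duplicate does no work
assert len(calls) == 1
```

Marking the message processed only after `process` succeeds is the key ordering: a crash between the two leaves the message unmarked, so the retry does the work again rather than losing it.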
Core logic per consumer is where the differences live:
- ERP consumer: fetches the full Dataverse row via Web API (using a managed identity), maps to the ERP's schema, calls the ERP API with an idempotency key derived from MessageId.
- Search consumer: fetches the row, calls the search service's indexing API.
- Power BI consumer: triggers a dataset partition refresh; uses the correlation ID to tag the refresh operation.
- Notification consumer: looks up which users to notify based on the row's owner and policy, sends via the notification service.
- Data lake consumer: appends a compressed JSON record to partitioned ADLS Gen2 path by date.
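For the data lake consumer, the partitioned path is mechanical. A sketch, assuming a Hive-style `year=/month=/day=` layout (the exact prefix and layout are illustrative, not the production convention):

```python
from datetime import datetime, timezone

def lake_path(entity, event_time, message_id):
    """Build the date-partitioned ADLS Gen2 path for one event record.
    Illustrative layout; the real consumer also gzips the JSON body."""
    d = event_time.astimezone(timezone.utc)
    return (f"events/{entity}/year={d:%Y}/month={d:%m}/day={d:%d}/"
            f"{message_id}.json.gz")

p = lake_path("account", datetime(2024, 5, 12, 9, 30, tzinfo=timezone.utc), "m-42")
assert p == "events/account/year=2024/month=05/day=12/m-42.json.gz"
```

Keying the file name on MessageId makes the append idempotent for free: a redelivered message overwrites the same blob instead of duplicating the record.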
Observability
The chain is not useful in production without observability. Our setup:
- Application Insights per Function App, with shared workspace so queries span functions.
- Every log entry includes CorrelationId and MessageId, attached via an ILogger.BeginScope logging scope in each consumer.
- Custom metrics: ProcessedMessages, FailedMessages, DeadLetteredMessages, ProcessingDuration.
- Dashboards (Azure Monitor workbooks): end-to-end latency per event type (from Dataverse change timestamp to final destination write), dead-letter queue depth per consumer, and error rate per consumer per hour.
- Alerts: DLQ depth > 10 for any consumer, processing duration p95 > SLA for any consumer.
The end-to-end trace for a single event is a KQL query across Application Insights joining by CorrelationId. One row change in Dataverse, traced through plugin → Service Bus publish → each consumer → each downstream call. When something fails, we know exactly where in the chain it failed.
Six months in production: the numbers
A client with:
- 50,000 Dataverse events per day
- 5 consumers averaging 3 seconds processing each
- Bursts to 500 events/minute during peak hours
Current state:
- P50 end-to-end latency (event → all consumers complete): 4.2 seconds
- P95 end-to-end latency: 11 seconds
- Dead-letter rate: 0.003% (one in 33,000 messages)
- Consumer error rate: 0.1-0.2% (mostly transient, auto-retried)
- Monthly Azure cost (Service Bus + Function Apps + Application Insights): ~$320
What we tuned after launch
Ratio of Function App instances to queue depth. The Consumption plan auto-scales, but the defaults were conservative. We adjusted maxConcurrentCalls per function to keep each instance near its processing capacity before the platform spins up more instances.
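maxConcurrentCalls lives in host.json. A hedged sketch for the v5 Service Bus extension (the value 16 is illustrative; with the older in-process v4.x extension the setting sits under messageHandlerOptions instead):

```json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "maxConcurrentCalls": 16,
      "prefetchCount": 0
    }
  }
}
```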
Filter rules on subscriptions. Initially each consumer received every event and filtered in code. Moving the filter to the subscription level (server-side) cut consumer costs by ~60% because consumers no longer woke up for irrelevant events.
Deduplication window. Initial 1-minute window was too short; plugin retries occasionally pushed duplicates more than a minute apart. Extended to 10 minutes after measuring actual retry intervals.
When not to ship this
This pattern is overkill for:
- Projects with one or two downstream consumers - a direct Power Automate flow is simpler.
- Low-volume scenarios (< 1000 events/day) - the operational complexity outweighs the throughput benefit.
- Teams without Azure experience - maintaining Service Bus, Function Apps, and Application Insights is real ops burden. Pick a simpler pattern until the operational muscle exists.
For the projects where it fits (enterprise-scale, multi-consumer, event-critical), this is the architecture we reach for first. It survives load, it surfaces failures cleanly, and when something goes wrong, the fix is almost always in a known location with clear visibility.