SapotaCorp

Migrating from Mule 3 to Mule 4: what actually changes

A Mule 3 to Mule 4 migration is not an upgrade you click through; it is a code rewrite disguised as a version bump. Drawing on a 30-app migration for a banking client, this piece walks through what the rewritten runtime actually changes, where the migration tooling stops helping, and the six places real projects bleed time.

Key takeaways

  • Mule 3 applications do not run on the Mule 4 runtime at all — there is no compatibility shim, so every app has to be migrated as a code change, not deployed as an upgrade.
  • The Mule Migration Assistant reliably auto-converts roughly 60-70% of an app's XML, but it falls down hardest on exactly the legacy patterns old apps lean on: MEL function calls, complex DataWeave 1.0 transforms, exception strategies, and MUnit tests.
  • The deepest changes are conceptual, not syntactic — a self-tuning reactive runtime, a single expression language in DataWeave 2.0, a three-slot event model, and a typed error hierarchy — and migrating without internalizing them produces apps that compile but misbehave under load.
  • Sequence the migration bottom-up by API-Led layer (System, then Process, then Experience), budget about a third of your effort for rewriting tests, and load-test before cutover to catch the performance regressions reactive scheduling can hide.

A while back I picked up a migration that looked, on paper, like a checkbox exercise: take a portfolio of around thirty MuleSoft applications built for a bank back in 2016-2017 and move them off Mule 3 before extended support ran out. Management framed it as "upgrade the runtime." The first thing I had to do was disabuse everyone of that framing, because Mule 3 apps do not run on the Mule 4 runtime. Not with a flag, not with a shim, not with a compatibility mode. The core was rewritten, and what you have is a translation project, not an upgrade.

That distinction matters because it changes how you estimate, how you staff, and how you sequence the work. These were not greenfield flows. They were full of Mule Expression Language, hand-rolled Java components, and the kind of exception strategies that accumulate when an integration layer has been in production for the better part of a decade. The naive plan — point the tooling at the repo and ship the output — would have produced thirty apps that compiled and then quietly misbehaved in production.

So before writing a single line of migrated XML, I spent time understanding what Mule 4 actually changed under the hood, because almost every migration gotcha traces back to one of those architectural decisions.

The runtime got opinionated so you don't have to be

Mule 3 was a product of the SOA era. Its kernel was built on a staged event-driven architecture, which is a polite way of saying every flow had its own queue and thread pool that you, the developer, were expected to tune. I have seen the failure mode this invites firsthand: one customer-enquiry flow on the old platform had its maxThreadsActive cranked to 200, and when the morning peak hit around 10am it happily ate the entire heap. The config knob that was supposed to give you control gave you a foot-gun.

Mule 4 takes that away. The runtime self-tunes against three global thread pools: CPU_LITE for short non-blocking work like routing and transformation, CPU_INTENSIVE for heavy compute such as large DataWeave, and BLOCKING_IO for database, file, and legacy HTTP calls. (That three-pool layout is what Mule 4.1 and 4.2 ship; from 4.3 onward the default is a single unified pool, but operations are still classified by the same processing types.) The runtime decides which pool a task belongs to based on connector metadata, and you declare nothing. Underneath it all sits a reactive, non-blocking engine built on Project Reactor, so a listener that would have pinned one thread per request on Mule 3 now serves an order of magnitude more connections with a fraction of the threads, because threads waiting on I/O get handed back to the pool instead of blocking.

The catch is the one that bit us later: if a legacy app has a custom Java component that blocks, and you port it without telling the runtime, it lands on CPU_LITE and starves the pool that everything else needs. The fix is to mark blocking invocations to run on BLOCKING_IO, but you only know to do that if you understand the model. The tooling will not warn you.
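The starvation effect is easy to reproduce outside Mule. The sketch below is an analogy in Python's asyncio, not Mule code: `legacy_blocking_call` is a hypothetical stand-in for a blocking Java component, the event loop plays the role of the non-blocking pool, and `asyncio.to_thread` plays the role of handing the work to a blocking-I/O pool.

```python
import asyncio
import time

BLOCK_SECONDS = 0.05  # stand-in for a blocking JDBC, file, or legacy call
TASKS = 5


def legacy_blocking_call() -> None:
    """Hypothetical blocking component: holds its thread until it finishes."""
    time.sleep(BLOCK_SECONDS)


async def run_inline() -> None:
    # Blocks the event loop itself, so no other task can make progress.
    legacy_blocking_call()


async def run_offloaded() -> None:
    # Hands the blocking work to a worker thread; the loop stays free.
    await asyncio.to_thread(legacy_blocking_call)


async def elapsed(worker) -> float:
    """Run TASKS copies of worker concurrently and return wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(worker() for _ in range(TASKS)))
    return time.perf_counter() - start


def compare() -> tuple[float, float]:
    inline = asyncio.run(elapsed(run_inline))
    offloaded = asyncio.run(elapsed(run_offloaded))
    return inline, offloaded


if __name__ == "__main__":
    inline, offloaded = compare()
    print(f"inline: {inline:.2f}s, offloaded: {offloaded:.2f}s")
```

Run inline, the five calls serialize and everything else waits behind them; offloaded, they overlap. Nothing in the code fails either way, which is why this class of bug only shows up under load.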

One expression language, one event, one error model

Three of the most disruptive changes are about consolidation, and they are where most of the manual rewrite hours go.

Mule 3 shipped two expression languages living side by side: MEL for inline expressions and DataWeave 1.0 confined to transform blocks. Mule 4 deletes MEL entirely and makes DataWeave 2.0 the language everywhere — in routers, in variable assignments, in connector parameters. That sounds clean until you count. The portfolio I was working with had on the order of two thousand MEL expressions, and every one of them needed a home in 2.0. Property access like #[message.inboundProperties['http.method']] becomes #[attributes.method], which is genuinely nicer, but the function calls are where it hurts.
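To triage a pile of MEL expressions, it helps to separate the mechanical rewrites from the ones that need a person. The following is a minimal sketch, not part of any MuleSoft tooling: the function name and the mapping table are my own, and the table covers only a handful of HTTP listener properties. Anything it does not recognize, and anything containing a function call, is flagged rather than guessed at.

```python
import re

# Mule 3 inbound HTTP property -> Mule 4 attribute. Deliberately small;
# extend it from the connector documentation rather than from guesswork.
HTTP_PROPERTY_TO_ATTRIBUTE = {
    "http.method": "attributes.method",
    "http.query.params": "attributes.queryParams",
    "http.uri.params": "attributes.uriParams",
    "http.request.path": "attributes.requestPath",
}

INBOUND = re.compile(r"message\.inboundProperties\[(['\"])(?P<name>[^'\"]+)\1\]")
FLOW_VAR = re.compile(r"\bflowVars\.")
CALL = re.compile(r"\w\s*\(")  # any function or method call


def rewrite_simple_access(expression: str) -> tuple[str, bool]:
    """Return (rewritten expression, needs_manual_review)."""
    needs_review = False

    def replace_inbound(match: re.Match) -> str:
        nonlocal needs_review
        target = HTTP_PROPERTY_TO_ATTRIBUTE.get(match.group("name"))
        if target is None:
            # Unknown property: leave it untouched and flag it.
            needs_review = True
            return match.group(0)
        return target

    rewritten = INBOUND.sub(replace_inbound, expression)
    rewritten = FLOW_VAR.sub("vars.", rewritten)
    if CALL.search(rewritten):
        # Function calls are exactly what does not convert mechanically.
        needs_review = True
    return rewritten, needs_review
```

For example, `rewrite_simple_access("#[message.inboundProperties['http.method']]")` yields the rewritten expression with the review flag off, while anything like a `StringUtils` call comes back flagged.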

The event model collapsed too. Mule 3's message had four regions — payload, read-only inbound properties, writable outbound properties, and invocation (flow) variables — and the perennial bug was that an outbound property set in flow A would leak in as an inbound property of flow B. Mule 4 cuts this to three: payload, read-only attributes tied to the source connector, and writable variables. Outbound headers no longer exist as a floating concept; you set them directly on the request connector. Concretely, a Mule 3 pattern of setting an Authorization outbound property before an outbound HTTP endpoint becomes an inline header on the <http:request> element. Every such site has to be found and rewritten by hand.

Error handling moved from exception strategies to a try-scope-plus-error-handler model that reads like Java try/catch. More importantly, Mule 4 introduced a typed error hierarchy — MULE:ANY at the top, then connector-specific types like HTTP:CONNECTIVITY and DB:CONNECTIVITY beneath it. On Mule 3 you caught a java.sql.SQLException; on Mule 4 you catch DB:CONNECTIVITY. That is a better abstraction, but it means you cannot mechanically map old catch blocks — you have to reason about which Mule error types correspond to the Java exceptions the old code was guarding against, and make sure your handlers actually cover them.
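One way to make the "do my handlers actually cover this" question concrete is a small coverage check. This is a sketch under loud assumptions: the mapping from Java exceptions to Mule error types below is hypothetical and incomplete (build the real one from each connector's documented error list), the function names are mine, and it ignores intermediate parent types in the hierarchy, treating only an exact match or a catch-all as coverage.

```python
# HYPOTHETICAL mapping: which Mule 4 error types could stand in for each
# Java exception the old catch blocks guarded against.
CANDIDATE_ERROR_TYPES = {
    "java.sql.SQLException": {
        "DB:CONNECTIVITY",
        "DB:QUERY_EXECUTION",
        "DB:BAD_SQL_SYNTAX",
    },
    "java.net.ConnectException": {"HTTP:CONNECTIVITY"},
    "java.util.concurrent.TimeoutException": {"HTTP:TIMEOUT"},
}

CATCH_ALL = {"ANY", "MULE:ANY"}


def is_covered(error_type: str, handled: set[str]) -> bool:
    """Exact match or a catch-all handler; parent types are not modelled."""
    return error_type in handled or bool(CATCH_ALL & handled)


def uncovered_errors(old_exceptions, handled_types) -> dict[str, set[str]]:
    """Map each old Java exception to the error types no handler catches."""
    handled = set(handled_types)
    gaps = {}
    for exception in old_exceptions:
        candidates = CANDIDATE_ERROR_TYPES.get(exception)
        if candidates is None:
            # No mapping yet: someone has to reason about this one by hand.
            gaps[exception] = {"<unmapped>"}
            continue
        missing = {t for t in candidates if not is_covered(t, handled)}
        if missing:
            gaps[exception] = missing
    return gaps
```

A flow that used to catch `java.sql.SQLException` and now only handles `DB:CONNECTIVITY` would show the other database error types as gaps, which is the conversation you want to have before production rather than after.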

The tooling gets you most of the way, then stops

MuleSoft's Mule Migration Assistant (MMA) is open source and genuinely useful. You point it at a Mule 3 project, give it a target like 4.6.0, and it emits a Mule 4 project skeleton plus an HTML report that flags what converted, what needs review, and what it could not touch. It even injects <!--migrationmessage--> annotations at the spots that need a human.

Here is the honest accuracy picture from real apps. Flow structure converts at around 90%. Common connectors land at 70-80%. DataWeave 1.0 to 2.0 sits at 60-70%: simple syntax is fine, complex transforms break. MEL to DataWeave drops to 40-50%, because basic property access converts but function calls do not. Exception strategies come in around 50-60%, and MUnit tests around 50%. Custom Java components are essentially untouched at roughly 10%; the tool wraps them, but the logic is yours to verify. Weight those by how much of a typical app's XML each category makes up (flow structure and connectors dominate) and you get 60-70% of the XML converted automatically. That remaining 30-40% is where the entire schedule lives.
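The overall figure depends on the mix. A plain mean of the category midpoints comes out at about 56%, so the 60-70% headline only holds when the high-converting categories make up most of the XML. The sketch below shows the arithmetic; the rates are midpoints of the ranges quoted above, while the `XML_SHARE` weights are hypothetical numbers I picked for illustration, not measurements from the project.

```python
# Midpoints of the per-category auto-conversion rates quoted above (percent).
CONVERSION_RATE = {
    "flow structure": 90,
    "common connectors": 75,
    "DataWeave 1.0 to 2.0": 65,
    "MEL to DataWeave": 45,
    "exception strategies": 55,
    "MUnit tests": 50,
    "custom Java": 10,
}

# HYPOTHETICAL share of an app's XML in each category, in percent.
XML_SHARE = {
    "flow structure": 25,
    "common connectors": 25,
    "DataWeave 1.0 to 2.0": 15,
    "MEL to DataWeave": 12,
    "exception strategies": 8,
    "MUnit tests": 12,
    "custom Java": 3,
}


def plain_mean(rates: dict[str, int]) -> float:
    """Unweighted average across categories."""
    return sum(rates.values()) / len(rates)


def weighted_mean(rates: dict[str, int], shares: dict[str, int]) -> float:
    """Average weighted by each category's share of the XML."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100")
    return sum(rates[name] * shares[name] for name in rates) / 100


if __name__ == "__main__":
    print(f"plain mean:    {plain_mean(CONVERSION_RATE):.1f}%")
    print(f"weighted mean: {weighted_mean(CONVERSION_RATE, XML_SHARE):.1f}%")
```

With these made-up weights the weighted figure is about 67%. An app that is heavy on MEL and custom Java will land well below that, which is worth knowing before you quote a number to management.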

The trap is treating the 60-70% as "almost done." It is not. The tool transforms code; it never runs the app. After MMA you still have to build, fix the compile errors, rewrite what broke, and test the whole thing as if it were new — because, functionally, it is.

The six places projects actually bleed time

After doing this across the portfolio, the same problems showed up repeatedly. MEL function calls top the list: things like StringUtils.upperCase or any static Java call get left as-is and break the build, so I ended up grepping the entire project with the extended regex #\[.*\..*\(.*\)\] to flush them out (the square brackets have to be escaped, or grep reads them as a character class). One app alone had forty-seven. Second, DataWeave syntax breakage: the %output directive and type names like :string are gone in 2.0 (output and String replace them), and using blocks give way to do. On top of the obvious failures we found that around 12% of transforms produced slightly different floating-point output because 2.0's default precision differs, which you only catch by diffing old and new output byte-for-byte against sample inputs.
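The grep is easy to turn into a repeatable report. This is a minimal sketch, assuming the project's Mule configs are `.xml` files under one directory; `find_mel_calls` is a name I made up, and the pattern is the one from the paragraph above with the square brackets escaped so they match literally. It is a coarse net, so expect some false positives.

```python
import re
from pathlib import Path

# "#[" ... "." ... "(" ... ")]" with brackets and parentheses matched literally.
MEL_CALL = re.compile(r"#\[.*\..*\(.*\)\]")


def find_mel_calls(project_dir):
    """Yield (path, line_number, line) for each suspected MEL function call."""
    for path in sorted(Path(project_dir).rglob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if MEL_CALL.search(line):
                yield path, number, line.strip()


if __name__ == "__main__":
    import sys

    for path, number, line in find_mel_calls(sys.argv[1]):
        print(f"{path}:{number}: {line}")
```

Running it per app before and after the rewrite gives you a count to burn down, rather than discovering leftovers one build failure at a time.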

Third, connector version drift: the Mule 4 connector with the same name often has different operation signatures (the Salesforce connector, for instance, changed its pagination model from offset to cursor between the Mule 3 and Mule 4 generations), so you read release notes per connector and re-test each operation. Fourth, MUnit: the 1.x tests simply do not run on Mule 4, and if you let coverage slide from 70% to 40% while you scramble to rewrite, your ops team rightly refuses to deploy. Budget a real chunk of effort, roughly a third, for tests and do not cut them to look fast. Fifth, performance regressions: Mule 4 is faster on I/O-bound work but can be slower on CPU-intensive DataWeave or, worse, when a blocking Java component lands on the wrong pool, so load-test before cutover. Sixth, property and secrets handling changed: our property files moved from .properties to YAML (Mule 4 reads both formats), and encrypted values moved to a separate secure-properties module, which means re-encrypting secrets and re-testing deployment.
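If you do move property files to YAML, the plain (unencrypted) ones can be converted mechanically. The sketch below is a standard-library-only illustration with function names of my own choosing. It assumes simple `key=value` lines with dotted keys, does not handle line continuations, `:` separators, or escape sequences, and does not touch encrypted values, which still have to be re-encrypted with the secure-properties tooling.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines; '#' and '!' start comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def nest(props: dict[str, str]) -> dict:
    """Turn dotted keys (db.host) into nested dictionaries."""
    tree: dict = {}
    for dotted, value in props.items():
        *parents, leaf = dotted.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{dotted!r} collides with a scalar key")
        if leaf in node:
            raise ValueError(f"{dotted!r} collides with an existing key")
        node[leaf] = value
    return tree


def to_yaml_lines(tree: dict, indent: int = 0) -> list[str]:
    """Emit block-style YAML, quoting every value so it stays a string."""
    lines = []
    pad = "  " * indent
    for key, value in tree.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 1))
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{pad}{key}: "{escaped}"')
    return lines


def properties_to_yaml(text: str) -> str:
    return "\n".join(to_yaml_lines(nest(parse_properties(text)))) + "\n"
```

Because the nesting mirrors the dotted keys, existing placeholders such as `${db.host}` keep resolving to the same names after the move. The collision check matters in old estates, where `a` and `a.b` both existing as keys is more common than you would hope.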

Sequence it like the architecture, not the calendar

The one planning decision that paid off most was migrating in API-Led order: System APIs first, then Process, then Experience. Experience APIs depend on Process APIs, which depend on System APIs, so migrating top-down just means blocking yourself on layers that have not moved yet. We split the thirty apps roughly into System, Process, and Experience tiers and moved them over about six months with a small team, starting with a proof-of-concept on one simple System API to calibrate real effort before committing the rest. The Experience layer cuts over last, with a brief parallel run of old and new before flipping the load balancer.
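Bottom-up ordering is a topological sort over the call graph, and in a real estate the tiers are rarely as clean as the diagram. A minimal sketch, using Python's standard `graphlib` and an entirely hypothetical set of app names and dependencies:

```python
from graphlib import TopologicalSorter

# HYPOTHETICAL apps: each one maps to the APIs it calls.
DEPENDS_ON = {
    "mobile-exp-api": {"customer-proc-api"},
    "customer-proc-api": {"core-banking-sys-api", "crm-sys-api"},
    "core-banking-sys-api": set(),
    "crm-sys-api": set(),
}


def migration_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group apps into waves; an app's dependencies are always in earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(migration_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Feeding it the actual dependency map also surfaces the awkward cases early: a Process API that calls another Process API ends up in a later wave than its tier suggests, and a cycle fails loudly instead of stalling the plan halfway through.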

If there is a single principle to carry out of this, it is to respect that Mule 4 is a rewrite wearing a version number. The architecture changes — self-tuning reactive runtime, one expression language, a simplified event, a typed error model — are not trivia; they are the reason the tooling can only take you partway and the reason a migrated app can compile cleanly and still fall over in production. Treat it as new development with a generous head start, test it like new development, and sequence it bottom-up, and the legacy estate comes across intact.
