So, you've heard the line. MongoDB has transactions now.
And probably, the very next thing that happened on your team is somebody wrapped a whole request handler in session.startTransaction() because, hey, why not be safe.
That instinct comes from the right place. Partial writes ruin people's evenings. The order got created, payment was charged, the inventory line never decremented, and now finance is asking you to explain a row that shouldn't exist. Once you've lived through that, "wrap everything in a transaction" sounds like the right answer.
The problem is that MongoDB transactions are not a free safety net. They have a real cost: latency, cache pressure, and the size of your error-handling code. Most of the writes you're worried about don't actually need them. The whole document model was designed around not needing them most of the time.
Let's break it down.
What MongoDB Was Already Atomic About
Before transactions, MongoDB wasn't a free-for-all. A single-document write has always been atomic. That's the foundational guarantee of the engine, and it does more work than people remember.
When you run this:
await db.collection("users").updateOne(
  { _id: userId },
  {
    $set: { "profile.timezone": "Europe/London" },
    $inc: { "stats.loginCount": 1 },
    $push: { "audit": { at: new Date(), event: "login" } }
  }
);
That update changes three different parts of the document (a nested field, a counter, an array) and the database treats it as one indivisible operation. Either all three changes land together, or none of them do. No reader will see a state where the timezone has been updated but the audit array hasn't.
Now look at what that gives you when you've embedded well. An order with its line items inside it. A user with their preferences inside them. A blog post with its (bounded) comment thread inside it. Updating the parent and any of its embedded children is one atomic write. No transaction needed. No BEGIN, no commit, no rollback machinery, no extra round trip.
This is the part of MongoDB that was always quietly doing transactional work for you. The document is the unit of consistency. If you can fit a business invariant inside a single document, you've already solved the problem.
A lot of "I need a transaction" instincts melt away once you ask: can this whole thing live in one document?
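As a minimal sketch of that question answered with a yes, assume a hypothetical inventory document that keeps an available and a reserved counter side by side. The function name, the collection, and the field names are illustrative assumptions, not something from a real schema. The filter carries the invariant, and the update moves both counters in one atomic write:

```javascript
// Hypothetical helper: reserve stock without a transaction.
// `inventory` is assumed to be a MongoDB collection whose documents
// look like { _id: sku, available: 10, reserved: 0 }.
async function reserveStock(inventory, sku, quantity) {
  const result = await inventory.updateOne(
    // The filter is the invariant: only match if enough stock is left.
    { _id: sku, available: { $gte: quantity } },
    // Both counters move together, in one single-document write.
    { $inc: { available: -quantity, reserved: quantity } }
  );
  // matchedCount is 0 when the SKU is missing or stock is too low.
  return result.matchedCount === 1;
}
```

The caller gets a plain boolean back and can decide what "out of stock" means for the request, with no session involved.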
When You Actually Need A Multi-Document Transaction
Embedding doesn't always work. Sometimes the invariant lives across two documents that genuinely shouldn't be merged into one. That's the place transactions earn their keep.
Three patterns where you really do need them:
1. Cross-collection writes that must succeed or fail together. You're moving money between two accounts, each in its own document. Debit one, credit the other. If only the debit lands, you've just lost a customer's money. There's no clever embedding trick. Accounts are independent entities with their own lifecycle. The two writes have to commit as a unit.
await session.withTransaction(async () => {
  const debit = await accounts.updateOne(
    { _id: fromId, balance: { $gte: amount } },
    { $inc: { balance: -amount } },
    { session }
  );
  // If the balance guard didn't match, abort before crediting anyone.
  if (debit.matchedCount === 0) throw new Error("INSUFFICIENT_FUNDS");
  await accounts.updateOne(
    { _id: toId },
    { $inc: { balance: amount } },
    { session }
  );
  await ledger.insertOne(
    { fromId, toId, amount, at: new Date() },
    { session }
  );
});
2. Multi-document state changes where the order isn't safe to leave half-done. You're closing a sprint: archiving the sprint document, marking its tasks as done, snapshotting the final velocity into a reports collection. If the sprint gets archived but the tasks don't get updated, you have ghost work-items pointing at a sprint that no longer exists.
3. Unique cross-document invariants that can't be expressed with a single index. One user can't own more than three projects in the free tier. Inserting a project and bumping the user's project counter need to be one atomic decision. You could do it with a clever upsert, but a transaction makes the invariant explicit and obvious to the next engineer who reads the code.
What these have in common: the writes touch more than one document, and there's no clean way to embed them into one. That's the test. If you can fold the data into a single document and use atomic operators, do that. If you genuinely can't, you have a real case for a transaction.
What doesn't pass the test (and is the most common false positive) is a single-document update that the developer wrapped in a transaction "to be safe." That's pure overhead with no benefit, because the engine was already going to give you atomicity for free.
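For contrast, here is a rough sketch of the free-tier limit from the third pattern, which does pass the test. It assumes each user document carries a projectCount field initialised to 0; that field, the limit constant, the error string, and the function name are all illustrative assumptions:

```javascript
// Hypothetical sketch of the free-tier project limit.
// `client` is assumed to be a connected MongoClient; `users` and `projects`
// are collections. `projectCount` is assumed to start at 0 on every user.
const FREE_TIER_LIMIT = 3;

async function createProject(client, users, projects, userId, project) {
  const session = client.startSession();
  try {
    await session.withTransaction(async () => {
      // Claim a slot only if the user is still under the limit.
      const claim = await users.updateOne(
        { _id: userId, projectCount: { $lt: FREE_TIER_LIMIT } },
        { $inc: { projectCount: 1 } },
        { session }
      );
      if (claim.matchedCount === 0) {
        // Throwing aborts the transaction, so no project is inserted.
        throw new Error("PROJECT_LIMIT_REACHED");
      }
      await projects.insertOne({ ...project, ownerId: userId }, { session });
    });
  } finally {
    await session.endSession();
  }
}
```

The counter bump and the insert live in two collections, so neither embedding nor a single atomic operator covers both writes.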
What A MongoDB Transaction Looks Like In Code
The shape is the same in every driver: open a session, run your operations under that session, commit if everything worked, abort if anything failed. The driver helper withTransaction wraps all of that, including the retry loop you'd otherwise hand-roll.
Here's the Node.js version, end to end:
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGO_URL);
await client.connect();
const db = client.db("bank");
const accounts = db.collection("accounts");
const ledger = db.collection("ledger");

async function transfer(fromId, toId, amount) {
  const session = client.startSession();
  try {
    await session.withTransaction(async () => {
      const debit = await accounts.updateOne(
        { _id: fromId, balance: { $gte: amount } },
        { $inc: { balance: -amount } },
        { session }
      );
      if (debit.matchedCount === 0) {
        // Insufficient funds. Throwing aborts the transaction.
        throw new Error("INSUFFICIENT_FUNDS");
      }
      await accounts.updateOne(
        { _id: toId },
        { $inc: { balance: amount } },
        { session }
      );
      await ledger.insertOne(
        { fromId, toId, amount, at: new Date() },
        { session }
      );
    }, {
      readConcern: { level: "snapshot" },
      writeConcern: { w: "majority" }
    });
  } finally {
    await session.endSession();
  }
}
A few things that aren't obvious from a quick reading:
- Every operation inside the block has to pass { session }. If you forget on one line, that line runs outside the transaction and won't roll back. This is the most common bug in transaction code, and the driver won't warn you.
- Throwing inside the callback aborts. No need to call abortTransaction manually: withTransaction catches the throw, aborts cleanly, and rethrows.
- withTransaction retries automatically on errors labeled TransientTransactionError or UnknownTransactionCommitResult. That's why your callback must be idempotent: it might run twice. More on that in a moment.
- Replica set required. Transactions don't work against a standalone mongod. Even if your local dev stack is "just one node," it has to be configured as a single-member replica set. You'll find this out the hard way the first time CI fails.
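Because the throw comes back out of withTransaction, the caller can separate an expected business rejection from a real failure. A small sketch, assuming the transfer function above is passed in; the wrapper name and the result shape are made up for illustration:

```javascript
// Hypothetical wrapper around a transfer(fromId, toId, amount) function.
// It relies only on the callback's error being rethrown to the caller.
async function tryTransfer(transfer, fromId, toId, amount) {
  try {
    await transfer(fromId, toId, amount);
    return { ok: true };
  } catch (err) {
    if (err instanceof Error && err.message === "INSUFFICIENT_FUNDS") {
      // Expected outcome: the transaction was aborted, nothing was written.
      return { ok: false, reason: "INSUFFICIENT_FUNDS" };
    }
    // Anything else (network trouble, timeouts) is a real error.
    throw err;
  }
}
```

Matching on the message string keeps the sketch short; a dedicated error class would be the sturdier choice in real code.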
The Python version is structurally identical, just with a with block around the session and a named callback function instead of an arrow function:
import os

from pymongo import MongoClient, WriteConcern
from pymongo.read_concern import ReadConcern

# Connection string comes from the environment, as in the Node.js version.
client = MongoClient(os.environ["MONGO_URL"])
db = client["bank"]

def transfer(from_id, to_id, amount):
    def _txn(session):
        debit = db.accounts.update_one(
            {"_id": from_id, "balance": {"$gte": amount}},
            {"$inc": {"balance": -amount}},
            session=session,
        )
        if debit.matched_count == 0:
            # Insufficient funds. Raising aborts the transaction.
            raise RuntimeError("INSUFFICIENT_FUNDS")
        db.accounts.update_one(
            {"_id": to_id},
            {"$inc": {"balance": amount}},
            session=session,
        )
        db.ledger.insert_one(
            {"from": from_id, "to": to_id, "amount": amount},
            session=session,
        )

    with client.start_session() as session:
        session.with_transaction(
            _txn,
            read_concern=ReadConcern("snapshot"),
            write_concern=WriteConcern("majority"),
        )
That's the entire transactional API surface. There isn't a deep magic to it. The interesting parts aren't in the syntax. They're in the costs you pick up the moment you start a transaction.
The Performance Cost Nobody Mentions Up Front
In a relational database, transactions feel native because the whole engine was built around them. In MongoDB, transactions are a layer on top of an engine that was originally designed around single-document atomicity. They work, they're correct, but they cost more than newcomers expect, and the cost shows up in places you didn't think to monitor.
Here's what's actually happening when you open a transaction:
A snapshot is taken. MongoDB transactions provide snapshot isolation by default. The session sees the database as it looked at the moment the transaction started, frozen, even if other writes commit while you're working. That snapshot has to live somewhere: the WiredTiger cache.
Locks are held longer. Outside a transaction, the engine grabs a document-level lock, applies your write, releases. Inside a transaction, locks acquired by your operations have to be held until commit or abort. The longer the transaction runs, the longer everyone else waits to update the same documents.
Cache pressure climbs. That snapshot, plus the in-progress writes, plus the undo information needed to roll back if you abort, all live in cache. A handful of long transactions can push out hot pages that other queries depended on, and suddenly your unrelated read latency goes up for a reason your dashboards don't explain.
There's a hard 60-second ceiling by default. The server parameter transactionLifetimeLimitSeconds defaults to 60. A transaction that exceeds it gets aborted by the server, with a TransientTransactionError that triggers withTransaction to retry the whole thing, possibly into the same wall. If your callback talks to anything external (an API, a slow file read, a spin lock you didn't realise was there), the budget is gone before you noticed.
Write conflicts cause retries. If two transactions try to modify the same document, the one that gets there second fails with a WriteConflict error. The driver retries. Under contention, you can get retry storms, where work gets done two or three times before one of them wins. The throughput loss is invisible until you graph p99 latency for that endpoint.
The callback must be idempotent. Because of those retries, anything you do inside the transaction needs to be safe to do twice. That includes not just the database writes (which the engine handles) but anything else you've smuggled in: appending to an in-memory list, generating a UUID and using it later, writing to a log. If you've done any of those, your retry path is now a bug factory.
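One way to keep the retry path boring, sketched under a few assumptions: the caller supplies a stable transferId (a request id, say), and notify is a hypothetical hook for whatever has to happen after commit. Neither name comes from the driver. The callback contains nothing but session-bound writes, so running it twice is harmless:

```javascript
// Sketch: keep non-database side effects out of the retried callback.
// `session`, `accounts`, and `ledger` are assumed to be a driver session
// and two collections; `transferId` and `notify` are supplied by the caller.
async function transferOnce(session, accounts, ledger, transferId, fromId, toId, amount, notify) {
  await session.withTransaction(async () => {
    // Only session-bound writes in here. An aborted attempt is rolled
    // back, so a retry starts from a clean slate.
    const debit = await accounts.updateOne(
      { _id: fromId, balance: { $gte: amount } },
      { $inc: { balance: -amount } },
      { session }
    );
    if (debit.matchedCount === 0) throw new Error("INSUFFICIENT_FUNDS");
    await accounts.updateOne({ _id: toId }, { $inc: { balance: amount } }, { session });
    // The id was chosen before the transaction started, so every attempt
    // writes the same ledger entry instead of minting a new one.
    await ledger.insertOne({ _id: transferId, fromId, toId, amount }, { session });
  });
  // Runs once, after withTransaction has resolved.
  await notify(transferId);
}
```

Generating the id or calling notify inside the callback would tie those side effects to the number of attempts rather than to the number of transfers.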
None of this is a reason not to use transactions. It's a reason not to use them casually. When you reach for a transaction, you should be able to answer "why does this need one?" with a real cross-document invariant, not a vague "it felt safer."
Embedding And Idempotency: The Two Cheaper Tools
Most of the writes that look like transaction territory aren't, once you remember the two cheaper tools the document model gives you.
Embedding collapses what would have been a multi-document update into a single-document update, which is atomic for free. The classic version of this:
// orders collection
{ _id: orderId, customerId, status: "pending" }
// orderItems collection (one row per line)
{ _id: itemId, orderId, sku, quantity, price }
That shape forces you into a transaction every time you change an order and its items together: a sprint full of "wrap it in a transaction" tickets, ten queries per checkout, 500 lines of error handling.
Now the same data, embedded:
{
  _id: orderId,
  customerId,
  status: "pending",
  items: [
    { sku: "BK-001", quantity: 1, price: 32.00 },
    { sku: "MUG-13", quantity: 2, price: 12.50 }
  ],
  totals: { subtotal: 57.00, total: 61.99 }
}
Adding an item, changing a quantity, marking the order paid: all of those are now one atomic update on one document. No session, no commit, no retry loop. The decision tree gets simpler too: you stop debating "should this part be a transaction?" on every PR.
This isn't a license to embed everything. The schema-design rules still apply: don't embed an unbounded array, don't embed children that have their own life and their own queries. But where embedding is already correct on schema grounds, you also get atomicity for free. That's a double win you should take.
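To make that concrete, here is a minimal sketch of adding a line item to an embedded order like the one above. The function name is hypothetical, and the totals handling is deliberately naive (it ignores tax and shipping), so treat it as an illustration of the shape rather than a pricing implementation:

```javascript
// Sketch: append a line item and keep the totals in step, in one update.
// `orders` is assumed to be a collection of embedded-order documents with
// `items` and `totals` fields.
async function addItem(orders, orderId, item) {
  const lineTotal = item.quantity * item.price;
  const result = await orders.updateOne(
    // Guard: only pending orders can still change.
    { _id: orderId, status: "pending" },
    {
      $push: { items: item },
      $inc: { "totals.subtotal": lineTotal, "totals.total": lineTotal },
    }
  );
  // False means the order is missing or no longer pending.
  return result.matchedCount === 1;
}
```

The item and both totals change together or not at all, which is the guarantee the split orders/orderItems shape needed a transaction for.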
Idempotency is the other tool. A surprising number of "transactional" workflows aren't actually atomic. They're eventually consistent, and the only reason a transaction looks like the answer is that the developer hasn't sat down and made each step safe to repeat.
Take a sign-up flow:
- Create the user.
- Create their default workspace.
- Send a welcome email.
Wrapping the first two in a transaction is fine and costs little. Putting the email send inside the transaction is a mistake: the network call burns your time budget, and "abort" can't unsend the email anyway. The right shape is: write to the database, commit, then enqueue a job that sends the email. The job has its own retry semantics. If the email job fails, it retries. If it runs twice, the recipient gets two emails, annoying but recoverable. If it never runs, your monitoring for the queue catches it.
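A sketch of that shape, with heavy assumptions: jobs is a plain collection used as a queue, the job id format is invented for this example, and user._id is set by the caller before the call. The two inserts share a transaction; the enqueue happens after it and is written so that doing it twice leaves one job:

```javascript
// Sketch of the sign-up flow: two writes commit together, the email is
// queued afterwards. `client` is assumed to be a connected MongoClient;
// `users`, `workspaces`, and `jobs` are collections.
async function signUp(client, users, workspaces, jobs, user) {
  const session = client.startSession();
  try {
    await session.withTransaction(async () => {
      await users.insertOne(user, { session });
      await workspaces.insertOne({ ownerId: user._id, name: "Default" }, { session });
    });
  } finally {
    await session.endSession();
  }
  // Outside the transaction. The job id is derived from the user id, so
  // enqueueing twice still leaves a single job document.
  await jobs.updateOne(
    { _id: `welcome-email:${user._id}` },
    { $setOnInsert: { type: "welcome-email", userId: user._id, status: "pending" } },
    { upsert: true }
  );
}
```

A separate worker, not shown here, would pick up pending jobs and talk to the email provider with its own retry policy.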
The general pattern: split the workflow into steps that are individually retryable. Each step's effect is keyed by something stable (a request id, a user id, a job id). When a step retries, it checks "have I already done this for this key?" before doing the work. The overall system tolerates failures and partial completions without needing a single big atomic envelope around everything.
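const claim = await jobs.updateOne(
  { _id: jobId, status: "pending" },
  { $set: { status: "running", startedAt: new Date() } }
);
// Another worker already claimed or finished this job: nothing to do.
if (claim.modifiedCount === 0) return;
// do the work...
await jobs.updateOne(
  { _id: jobId, status: "running" },
  { $set: { status: "done", finishedAt: new Date() } }
);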
That little update-with-status-guard pattern, applied across a workflow, gives you something better than a transaction in many cases: a process that can crash, restart, run twice, run on three nodes simultaneously, and still converge to a correct state. Transactions can't do that. They give you all-or-nothing within a process, but they don't help you when the process itself dies between commit and the next step.
A useful gut check: if your workflow has any step that talks to a third party (a payment gateway, an email provider, an external HTTP API), a transaction can't save you. The third party doesn't roll back. Idempotency is the only thing that does.
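What that can look like for a payment, as a rough sketch: the gateway object and its charge method are hypothetical stand-ins for an external API that accepts an idempotency key (many providers offer one, but check yours), and the payments collection and its status values are assumptions too. The only driver-specific fact used is that a duplicate _id insert fails with error code 11000:

```javascript
// Sketch: charge a third party at most once per order.
// `payments` is assumed to be a collection; `gateway.charge` is a
// hypothetical external client that deduplicates on `idempotencyKey`.
async function chargeOnce(payments, gateway, orderId, amount) {
  try {
    // _id doubles as the idempotency key: a second insert for the same
    // order fails with a duplicate key error.
    await payments.insertOne({ _id: orderId, amount, status: "charging" });
  } catch (err) {
    if (err.code !== 11000) throw err; // 11000 = duplicate key
    const existing = await payments.findOne({ _id: orderId });
    if (existing && existing.status === "charged") return existing;
    // Otherwise an earlier attempt died part-way: fall through and retry.
  }
  const receipt = await gateway.charge({ idempotencyKey: orderId, amount });
  await payments.updateOne(
    { _id: orderId, status: "charging" },
    { $set: { status: "charged", receiptId: receipt.id } }
  );
  return payments.findOne({ _id: orderId });
}
```

Two callers racing through the retry branch can both reach the gateway, which is why the sketch leans on the provider honouring the idempotency key rather than on anything MongoDB does.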
So, Need Or Not?
Reach for a transaction when there's a real cross-document invariant that can't be embedded into one document and can't be salvaged by retrying steps. That's a smaller set than it looks like at first.
For everything else (the single-document updates the engine was already handling, the multi-step flows that work better as idempotent jobs, the parent-child relationships that should have been embedded all along), there's a cheaper, faster, more resilient tool.
MongoDB transactions are correct. They're also expensive. Pick them on purpose.





