When I first started writing Batch Apex, I treated it like a glorified for-loop. Throw everything in execute(), call it a day. That approach works fine until you’re dealing with 5 million records, three external API callouts per record, and a finish deadline of 6 AM before the business opens. Then the wheels come off fast.
Over the years I’ve built batch jobs that process insurance policy renewals, sync order data to ERP systems, and archive multi-terabyte log tables. These are the patterns I keep reaching for.
Understanding the Batch Lifecycle
Before diving into patterns, it helps to have the lifecycle clearly in mind.
The key insight is that governor limits reset between every execute() call. Each chunk of up to 200 records is an independent transaction. This is both the power and the constraint of Batch Apex.
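For reference, here is the bare three-method shape every batch class shares; this skeleton is generic, not tied to any example below:

public class ExampleBatch implements Database.Batchable<SObject> {
    // start() runs once and defines the full record set for the job.
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }
    // execute() runs once per chunk of up to 200 records;
    // governor limits reset before each call.
    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // Per-chunk work goes here.
    }
    // finish() runs once after the last chunk: notifications, chaining, cleanup.
    public void finish(Database.BatchableContext bc) {}
}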
Pattern 1: The Stateful Aggregator
The Database.Stateful interface lets you persist instance variables across execute() calls. I use this sparingly because it increases heap pressure — but it’s the right tool when you need to aggregate results across the entire job.
public class AccountSyncBatch implements Database.Batchable<SObject>, Database.Stateful {
private Integer successCount = 0;
private Integer failureCount = 0;
private List<String> errorMessages = new List<String>();
public Database.QueryLocator start(Database.BatchableContext bc) {
return Database.getQueryLocator(
'SELECT Id, Name, AnnualRevenue, External_Id__c ' +
'FROM Account WHERE External_Id__c != null AND Sync_Required__c = true'
);
}
public void execute(Database.BatchableContext bc, List<Account> scope) {
List<Account> toUpdate = new List<Account>();
for (Account acc : scope) {
try {
acc.Last_Synced__c = Datetime.now();
acc.Sync_Required__c = false;
toUpdate.add(acc);
successCount++;
} catch (Exception e) {
failureCount++;
errorMessages.add('Account ' + acc.Id + ': ' + e.getMessage());
}
}
if (!toUpdate.isEmpty()) {
Database.SaveResult[] results = Database.update(toUpdate, false);
for (Integer i = 0; i < results.size(); i++) {
if (!results[i].isSuccess()) {
failureCount++;
successCount--;
errorMessages.add(results[i].getErrors()[0].getMessage());
}
}
}
}
public void finish(Database.BatchableContext bc) {
Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
mail.setToAddresses(new List<String>{'admin@yourorg.com'});
mail.setSubject('Account Sync Complete: ' + successCount + ' success, ' + failureCount + ' failed');
mail.setPlainTextBody(String.join(errorMessages, '\n'));
Messaging.sendEmail(new List<Messaging.SingleEmailMessage>{mail});
}
}
The false parameter in Database.update(scope, false) is critical: it enables partial success, so one bad record doesn't roll back the entire chunk.
The Problem
Scenario: Your batch job syncs 500,000 Account records to an ERP. Three hours in, a single malformed record causes the entire job to fail, losing all progress. You have no idea which record caused the issue or how far the job got before it crashed.
The Solution
Implement Database.Stateful with Database.update(scope, false) (partial-success DML). Collect failures into an errorMessages list, or publish them as Platform Events (see Pattern 4) so the log survives a rollback. The job continues processing all remaining chunks, and you receive a full summary email in finish() detailing exactly which records failed and why.
Pattern 2: Batch Chaining for Multi-Stage Pipelines
When a job logically has multiple phases (say, first archive records, then delete them, then send a report), chaining batches in finish() keeps each stage clean and gives it a fresh set of governor limits.
public class DataArchiveBatch implements Database.Batchable<SObject> {
private Boolean isFinalStage;
public DataArchiveBatch(Boolean isFinalStage) {
this.isFinalStage = isFinalStage;
}
public Database.QueryLocator start(Database.BatchableContext bc) {
if (!isFinalStage) {
return Database.getQueryLocator(
'SELECT Id, Name, CreatedDate FROM Audit_Log__c ' +
'WHERE CreatedDate < LAST_N_YEARS:2 AND Archived__c = false'
);
} else {
return Database.getQueryLocator(
'SELECT Id FROM Audit_Log__c WHERE Archived__c = true'
);
}
}
public void execute(Database.BatchableContext bc, List<SObject> scope) {
if (!isFinalStage) {
List<Audit_Log__c> logs = (List<Audit_Log__c>) scope;
for (Audit_Log__c log : logs) {
log.Archived__c = true;
}
update logs;
} else {
delete scope;
}
}
public void finish(Database.BatchableContext bc) {
if (!isFinalStage) {
Database.executeBatch(new DataArchiveBatch(true), 200);
} else {
System.debug('Archive pipeline complete.');
}
}
}
Kick off the pipeline with Database.executeBatch(new DataArchiveBatch(false), 200). Stage 2 starts automatically when Stage 1 finishes.
When chaining more than two batch stages, pass a stage index integer instead of a boolean flag. This makes the chain easier to extend — add a new stage by incrementing the max value and adding a new else if branch, rather than refactoring the entire boolean logic.
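Here's a minimal sketch of that stage-index variant; the queries are placeholders, not a drop-in replacement for the archive job above:

public class PipelineBatch implements Database.Batchable<SObject> {
    private static final Integer MAX_STAGE = 3;
    private Integer stage;

    public PipelineBatch(Integer stage) {
        this.stage = stage;
    }

    public Database.QueryLocator start(Database.BatchableContext bc) {
        if (stage == 1) {
            return Database.getQueryLocator('SELECT Id FROM Audit_Log__c WHERE Archived__c = false');
        } else if (stage == 2) {
            return Database.getQueryLocator('SELECT Id FROM Audit_Log__c WHERE Archived__c = true');
        }
        // Stage 3 has nothing to iterate; finish() still fires for reporting.
        return Database.getQueryLocator('SELECT Id FROM Audit_Log__c LIMIT 0');
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // Branch on this.stage to do that stage's work.
    }

    public void finish(Database.BatchableContext bc) {
        if (stage < MAX_STAGE) {
            // To add a stage: bump MAX_STAGE and add an else-if in start().
            Database.executeBatch(new PipelineBatch(stage + 1), 200);
        }
    }
}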
Pattern 3: Scope-Size Tuning
The default batch size is 200. That’s often not optimal. Here’s my mental model:
- SOQL-heavy processing: keep scope at 200, queries are fast
- DML-heavy processing (many inserts per record): drop to 50-100
- Callout-per-record operations: shrink the scope aggressively, even to 1. Each execute() transaction allows at most 100 callouts and about 120 seconds of cumulative callout time, and slow endpoints eat through both quickly
- Simple field updates on large tables: push to 2000 for throughput
Callout Batch (scope=1)
// Callout batch — one record per execute() to stay within callout limits
Database.executeBatch(new ExternalSyncBatch(), 1);
public class ExternalSyncBatch implements Database.Batchable<SObject>, Database.AllowsCallouts {
// ... standard structure
}
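To make the stub concrete, here's a sketch of what that execute() might contain; the Named Credential, path, and payload are hypothetical, and External_Id__c and Name are assumed to be in the start() query:

public void execute(Database.BatchableContext bc, List<SObject> scope) {
    Account acc = (Account) scope[0]; // scope size is 1
    HttpRequest req = new HttpRequest();
    // Hypothetical endpoint behind a Named Credential.
    req.setEndpoint('callout:ERP_Named_Credential/accounts/' + acc.External_Id__c);
    req.setMethod('PUT');
    req.setHeader('Content-Type', 'application/json');
    req.setBody(JSON.serialize(new Map<String, Object>{ 'name' => acc.Name }));
    HttpResponse res = new Http().send(req);
    if (res.getStatusCode() == 200) {
        acc.Last_Synced__c = Datetime.now();
        update acc;
    }
}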
Bulk Field Stamp (scope=2000)
// Bulk field stamp: maximize throughput
Database.executeBatch(new StatusStampBatch(), 2000);
Pattern 4: Robust Error Logging
Silent failures are the worst kind. I always pair a batch job with an error log. Here's a lightweight pattern that keeps log DML out of the batch transaction by publishing a Platform Event instead, which decouples the error log from the transaction that produced it.
public void execute(Database.BatchableContext bc, List<Account> scope) {
List<Batch_Error__e> errors = new List<Batch_Error__e>();
for (Account acc : scope) {
try {
processAccount(acc);
} catch (Exception e) {
errors.add(new Batch_Error__e(
Job_Id__c = bc.getJobId(),
Record_Id__c = acc.Id,
Error_Message__c = e.getMessage(),
Stack_Trace__c = e.getStackTraceString().left(255)
));
}
}
if (!errors.isEmpty()) {
EventBus.publish(errors);
}
}
Using Platform Events here is smart: as long as the event is configured with the Publish Immediately behavior, the publish is committed even if the batch transaction rolls back, so your error log survives failures. (Events set to Publish After Commit roll back with the transaction.) Subscribe to the event stream with a trigger that creates Batch_Error_Log__c records, and set up a daily alert if the log object has any unreviewed entries.
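That subscriber can be a few lines; this sketch assumes Batch_Error_Log__c mirrors the event's fields, with a hypothetical Reviewed__c checkbox driving the daily alert:

trigger BatchErrorSubscriber on Batch_Error__e (after insert) {
    List<Batch_Error_Log__c> logs = new List<Batch_Error_Log__c>();
    for (Batch_Error__e evt : Trigger.new) {
        logs.add(new Batch_Error_Log__c(
            Job_Id__c = evt.Job_Id__c,
            Record_Id__c = evt.Record_Id__c,
            Error_Message__c = evt.Error_Message__c,
            Stack_Trace__c = evt.Stack_Trace__c,
            Reviewed__c = false // assumed checkbox for the alert query
        ));
    }
    insert logs;
}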
Pattern 5: The Iterable Batch for Non-SOQL Sources
Database.QueryLocator handles SOQL queries. But sometimes your data source isn’t a standard SOQL query — it could be an External Object, a list of IDs from a metadata-driven config, or the result of a complex aggregation. That’s where Iterable<T> comes in.
public class IdListBatch implements Database.Batchable<Id> {
private List<Id> recordIds;
public IdListBatch(List<Id> ids) {
this.recordIds = ids;
}
public Iterable<Id> start(Database.BatchableContext bc) {
return recordIds;
}
public void execute(Database.BatchableContext bc, List<Id> scope) {
List<Account> accounts = [
SELECT Id, Name, Status__c
FROM Account
WHERE Id IN :scope
];
// Process...
}
public void finish(Database.BatchableContext bc) {}
}
With an Iterable, you can process at most 50,000 records total, because the standard SOQL row limit still applies. If you need more, use QueryLocator, which supports up to 50 million rows.
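For a source that never touches SOQL at all, you can batch over plain strings too. A sketch, assuming the CSV body is small enough to sit on the heap when the job is constructed:

public class CsvLineBatch implements Database.Batchable<String> {
    private List<String> lines;

    public CsvLineBatch(String csvBody) {
        this.lines = csvBody.split('\n');
    }

    public Iterable<String> start(Database.BatchableContext bc) {
        return lines;
    }

    public void execute(Database.BatchableContext bc, List<String> scope) {
        // Parse each line and upsert the corresponding records here.
    }

    public void finish(Database.BatchableContext bc) {}
}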
Pattern 6: Avoiding the Requeuing Trap
One anti-pattern I see constantly is chaining a batch to itself to "continue processing". Each batch job counts against the limit of five queued or active batch jobs, and a self-requeuing job holds a slot indefinitely; a handful of them will fill all five slots and starve everything else. A scheduler-driven, pull-based design is safer:
public class BatchScheduler implements Schedulable {
public void execute(SchedulableContext sc) {
Integer pendingCount = [
SELECT COUNT() FROM Account WHERE Sync_Required__c = true LIMIT 1
];
if (pendingCount > 0) {
Database.executeBatch(new AccountSyncBatch(), 200);
}
}
}
Schedule this every 15 minutes via System.schedule() and you have a pull-based processor that's polite about system resources.
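Salesforce cron expressions have no */15 interval syntax, so the usual trick is to register four schedules, one per quarter hour (the job names here are arbitrary but must be unique):

for (Integer m = 0; m < 60; m += 15) {
    System.schedule(
        'AccountSyncScheduler-' + m,
        '0 ' + m + ' * * * ?',
        new BatchScheduler()
    );
}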
Query AsyncApexJob inside your scheduler before calling Database.executeBatch() to prevent duplicate jobs from overlapping. If a job with the same class name already has a status of 'Processing' or 'Queued', skip the dispatch entirely. This is critical when your scheduler runs every 15 minutes but the batch can take 45+ minutes to complete.
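A minimal version of that guard, using the Pattern 1 class name:

// Skip dispatch if a run of the same batch class is already in flight.
Integer active = [
    SELECT COUNT()
    FROM AsyncApexJob
    WHERE JobType = 'BatchApex'
    AND ApexClass.Name = 'AccountSyncBatch'
    AND Status IN ('Holding', 'Queued', 'Preparing', 'Processing')
];
if (active == 0) {
    Database.executeBatch(new AccountSyncBatch(), 200);
}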
Common Mistakes I Still See in Code Reviews
Using instance variables without Stateful. If you declare private Integer count = 0 without Database.Stateful, that count resets to zero on every execute() call. Your aggregate will always be wrong.
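The fix is one interface away; these fragments just show the declaration:

// Wrong: count resets to 0 before every execute() call
public class CountBatch implements Database.Batchable<SObject> { private Integer count = 0; /* ... */ }
// Right: count persists for the whole job
public class CountBatch implements Database.Batchable<SObject>, Database.Stateful { private Integer count = 0; /* ... */ }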
String concatenation in the query. Always parameterize:
// Wrong
'SELECT Id FROM Account WHERE OwnerId = \'' + userId + '\''
// Right
Database.getQueryLocator([SELECT Id FROM Account WHERE OwnerId = :userId])
Ignoring AsyncApexJob for monitoring. After kicking off a batch, query AsyncApexJob to check status programmatically:
AsyncApexJob job = [
SELECT Status, NumberOfErrors, JobItemsProcessed, TotalJobItems
FROM AsyncApexJob WHERE Id = :batchJobId
];
This is invaluable for building status dashboards in custom admin tools.
Putting It Together: A Real-World Data Cleanup Job
Here’s a complete, production-ready batch that combines stateful tracking, partial DML success, and chained execution:
public class StaleLeadCleanupBatch implements Database.Batchable<SObject>, Database.Stateful {
public Integer processed = 0;
public Integer errors = 0;
public Database.QueryLocator start(Database.BatchableContext bc) {
// CreatedDate is a Datetime field, so bind a Datetime rather than a Date
Datetime cutoff = Datetime.now().addYears(-2);
return Database.getQueryLocator(
'SELECT Id, Status, OwnerId FROM Lead ' +
'WHERE CreatedDate < :cutoff AND IsConverted = false AND Status != \'Qualified\''
);
}
public void execute(Database.BatchableContext bc, List<Lead> scope) {
for (Lead l : scope) {
l.Status = 'Unqualified';
l.Description = 'Auto-closed by cleanup job on ' + Date.today();
}
Database.SaveResult[] results = Database.update(scope, false);
for (Database.SaveResult r : results) {
if (r.isSuccess()) processed++;
else errors++;
}
}
public void finish(Database.BatchableContext bc) {
System.debug('Cleanup complete. Processed: ' + processed + ', Errors: ' + errors);
// Chain to deletion batch for truly dead leads
if (processed > 0) {
Database.executeBatch(new StaleLeadDeleteBatch(), 200);
}
}
}
Batch Apex rewards careful design upfront. Get the lifecycle right, choose your scope size deliberately, and always log failures somewhere observable.
What’s the most painful batch Apex failure you’ve debugged in production? Was it a silent data loss issue, a scope-size miscalculation, or something more exotic? Share your war story in the comments — these are the experiences the Salesforce community learns most from.