I’ve worked on orgs ranging from a few thousand records to over 50 million, and the difference in how you need to think about data is enormous. What works perfectly at 100K records can bring an org to its knees at 5 million. Large Data Volume (LDV) is one of those topics that’s rarely taught until something breaks in production.
Let me walk through the strategies I rely on — from the quick wins to the architectural decisions that require planning from day one.
What Salesforce Considers “Large Data Volume”
Salesforce’s official LDV threshold is typically cited as 1 million records per object. In practice, you’ll feel the pain earlier — often around 500K records — depending on your query patterns, the number of fields, and the automation layered on top.
The symptoms are predictable:
- List views time out
- Reports take minutes instead of seconds
- Apex queries hit governor limits
- Batch jobs slow to a crawl
None of these are inevitable. They’re architecture problems with known solutions.
LDV Decision Framework
Scenario: A manufacturing org’s Case list views and reports started timing out after passing 2 million records. The SOQL queries were filtering on Status__c and Region__c — both unindexed fields — causing full table scans on every load.
A custom index request was submitted to Salesforce Support for both fields. After the 2-day turnaround, list view and report query times dropped from 30+ seconds to under 2 seconds. No code changes, no schema changes — just an index.
Custom Indexes — The Fastest Win
Standard Salesforce indexes cover a handful of fields out of the box: Id, Name, OwnerId, CreatedDate, SystemModstamp, RecordTypeId, lookup and master-detail relationship fields, and fields marked as external IDs or unique. Filtering on any other field forces a full table scan.
If your queries filter on a non-indexed field — say, Status__c or Account_Region__c — you can request a custom index from Salesforce Support. This is free and typically takes 1-2 business days.
Fields that benefit most from custom indexes:
- Fields used in WHERE clauses across large result sets
- Fields used in report filters with millions of rows
- Fields used in list view filters
Index limitations to know
- Null values are not indexed by default, so filters like Status__c = null can't use the index (Support can enable null inclusion on a custom index)
- Multi-currency orgs may have restrictions
- Indexes on formula fields are not supported
- For fields with skewed data (e.g., 80% of records have Status__c = 'Closed'), a standard index won't help. You'll need a two-column index or a different query strategy.
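When a filter value is skewed, one way to restore selectivity is to pair it with a selective, indexed filter. A minimal sketch (field choices are illustrative):

```apex
// Filtering on skewed Status__c alone forces a near-full scan.
// Pairing it with an indexed date range keeps the query selective,
// similar in effect to a two-column index on (CreatedDate, Status).
List<Case> recentClosed = [
    SELECT Id, Subject
    FROM Case
    WHERE CreatedDate = LAST_N_DAYS:30
    AND Status = 'Closed'
    LIMIT 200
];
```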
Skinny Tables
A skinny table is a Salesforce-internal feature where Support creates a secondary, narrow table that contains only the columns your most-common queries need. Queries against that object read the skinny table instead of the full table, which is dramatically faster.
Skinny tables are ideal when you have a report or list view that pulls 5-10 fields from an object with 5 million+ records, and the query runs frequently (dashboard, home page, API integration). Request them from Salesforce Support — specify the object and the exact fields you need.
Query Optimization in Apex
Even with indexes in place, poorly written SOQL will hurt you at volume:
Bad Pattern
// BAD: retrieves all fields, no selective filter, no limit
List<Case> cases = [SELECT Id, Subject, Status, Account.Name FROM Case];

Optimized Pattern
// BETTER: filter on indexed fields, limit columns, use LIMIT
List<Case> cases = [
SELECT Id, Subject, Status
FROM Case
WHERE Status = 'Open'
AND CreatedDate >= :Date.today().addDays(-30)
ORDER BY CreatedDate DESC
LIMIT 200
];

Rules I follow for LDV queries:
- Always filter on indexed fields — use CreatedDate, OwnerId, or your custom indexed fields
- Use date literals (LAST_N_DAYS:30) where possible; they're index-friendly
- Never use != or NOT IN on large sets; these force full table scans
- Avoid OFFSET for pagination on large result sets; use cursor-based pagination with WHERE Id > :lastId instead
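A minimal sketch of cursor-based (keyset) pagination; in real use this loop would live in a batch or queueable context to stay within governor limits:

```apex
Id lastId = null;
Integer pageSize = 200;
List<Case> page;
do {
    // First page has no cursor; later pages resume after the last Id seen.
    page = (lastId == null)
        ? [SELECT Id, Subject FROM Case WHERE Status = 'Open'
           ORDER BY Id ASC LIMIT :pageSize]
        : [SELECT Id, Subject FROM Case WHERE Status = 'Open' AND Id > :lastId
           ORDER BY Id ASC LIMIT :pageSize];
    if (!page.isEmpty()) {
        lastId = page[page.size() - 1].Id;
        // ...process the page here...
    }
} while (page.size() == pageSize);
```

Because Id is always indexed and the sort order is stable, each page query stays fast no matter how deep into the result set you are, unlike OFFSET, which re-scans everything it skips.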
Batch Apex at Scale
Standard Batch Apex has a default chunk size of 200 records. For LDV processing, I often reduce this:
Database.executeBatch(new MyBatchClass(), 50);

Why smaller batch sizes help at scale
Smaller batches mean more transactions, but each transaction is less likely to hit CPU or DML limits. For read-heavy operations (no DML), you can often go higher — 2000 records per chunk.
For very large datasets, use indexed WHERE clauses so the initial query completes fast:
public Database.QueryLocator start(Database.BatchableContext bc) {
return Database.getQueryLocator([
SELECT Id, Status__c FROM My_Object__c
WHERE Status__c = 'Pending'
AND CreatedDate < :Date.today().addDays(-7)
]);
}

Add an Is_Archived__c checkbox to any object that will grow large. Exclude it from all list view, report, and trigger filters from day one. When record counts climb, flip older records to archived in a nightly batch job. Your active working set stays small, your automation stays fast, and you avoid the painful retroactive work of adding archive logic after slowdowns appear.
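A sketch of what that nightly archive job might look like (the object name and the two-year cutoff are illustrative; tune the cutoff to your retention needs):

```apex
// Nightly batch that flips records older than two years to archived.
public class ArchiveOldRecordsBatch implements Database.Batchable<SObject> {
    public Database.QueryLocator start(Database.BatchableContext bc) {
        // Both filters are index-friendly: a boolean flag plus CreatedDate.
        return Database.getQueryLocator([
            SELECT Id FROM My_Object__c
            WHERE Is_Archived__c = false
            AND CreatedDate < :Date.today().addYears(-2)
        ]);
    }
    public void execute(Database.BatchableContext bc, List<My_Object__c> scope) {
        for (My_Object__c rec : scope) {
            rec.Is_Archived__c = true;
        }
        update scope;
    }
    public void finish(Database.BatchableContext bc) {}
}
```

Schedule it with System.schedule or a Scheduled Apex class so the working set shrinks every night without manual intervention.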
BigObjects — Unlimited Historical Storage
BigObjects are a Salesforce feature for storing massive volumes of data (billions of records) directly in the platform. They don’t count against your standard storage limit.
When to Use
- Audit trails and compliance logs
- Historical transaction records
- Email send history, click/open tracking
- Any data that’s written once and rarely updated
Limitations
- No standard UI — you must access data via SOQL, Apex, or API
- No triggers, workflows, or flows on BigObjects
- You can only query on the indexed fields defined in the object’s index
- No UPDATE DML: you insert and you delete, but you can't update rows in place (re-inserting a row with the same index field values overwrites the existing row)
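Because queries must filter on the index, a BigObject SOQL query looks like this sketch (field names follow the My_Archive__b example in this section, and assume Account_Id__c is the leading index field):

```apex
// BigObject queries can only filter on the fields defined in the
// object's index, referenced in index order, leading fields first.
List<My_Archive__b> events = [
    SELECT Account_Id__c, Event_Date__c, Event_Type__c
    FROM My_Archive__b
    WHERE Account_Id__c = :accountId
];
```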
// Insert to BigObject
My_Archive__b record = new My_Archive__b();
record.Account_Id__c = accountId;
record.Event_Date__c = Datetime.now();
record.Event_Type__c = 'StatusChange';
Database.insertImmediate(record);

External Objects — Data Where It Lives
If historical data lives in an external system (data warehouse, legacy ERP), you don’t need to import it into Salesforce. External Objects via Salesforce Connect let you query external data as if it were native, using OData or custom adapters.
The benefit: zero Salesforce storage consumption, data stays current at source. The tradeoff: each query makes a real-time callout to the external system, so latency matters.
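External objects carry the __x suffix but query like native objects. A hypothetical sketch, assuming an Order__x external object mapped via an OData adapter (all names here are illustrative):

```apex
// This SOQL triggers a real-time callout to the external system,
// so keep filters selective and result sets small.
List<Order__x> orders = [
    SELECT ExternalId, Order_Total__c
    FROM Order__x
    WHERE Customer_Id__c = :customerId
    LIMIT 50
];
```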
Data Partitioning Strategies
For record-triggered automations and reports, partitioning your data logically can prevent slowdowns. Common patterns:
- Archive flag: add an Is_Archived__c checkbox and filter it out of all reports and triggers
- Fiscal year partitioning: use a formula field to assign records to a year partition, then build separate list views per year
- Soft delete pattern: set Is_Deleted__c = true instead of hard-deleting, and batch purge nightly
Why the archive flag pattern is so powerful
The archive flag pattern is particularly powerful because it requires minimal schema change but dramatically reduces the working set that triggers and reports operate on. A single checkbox, added proactively, can defer the need for BigObjects or external objects by years.
The Conversation to Have Before You Build
The most valuable LDV work happens before a record is written to the database. When designing a new object or integration, I always ask:
- How many records will this hold in 3 years?
- What fields will we filter on in reports and list views?
- Do we need all historical records accessible in the UI, or just recent ones?
- Are there compliance requirements that force retention of old records?
Answers to these questions determine whether you need custom indexes, skinny tables, BigObjects, or external objects — before you’ve written a line of code.
What LDV problem is your org currently struggling with, and which of these strategies are you considering? If you've used BigObjects in production, I'd especially love to hear how you handled the lack of UI access; share your approach in the comments.