The Secret to Writing Ultra-Optimized Apex for Large-Scale Data Processing

When you’re writing Apex in Salesforce, it’s not just about making the code work — it’s about making it work well. And in enterprise environments, where large-scale data processing is the norm, performance isn’t a luxury — it’s a necessity.

So what’s the secret? It’s a mix of understanding the Apex runtime engine, respecting Salesforce’s governor limits, and writing smart, scalable code. In this post, I’m going to walk you through practical tips, lessons learned in the trenches, and a few lesser-known gems to help you build Apex that doesn’t break a sweat, no matter how large the dataset.

Introduction: Why This Matters

Let’s paint the picture. You’re working on a Salesforce org with millions of records — opportunities, cases, custom objects, you name it. You deploy a trigger or batch job, and it chokes halfway through. Governor limits scream, “Too many SOQL queries!” or “Too many DML statements!”

Sound familiar?

Salesforce is multi-tenant, meaning your code has to play nice with others. There’s a ceiling — and if your code isn’t optimized, it hits that ceiling fast. But with the right approach, you can process thousands (even millions) of records efficiently.


The Building Blocks of Optimized Apex

Here’s where we get into the good stuff. If you want your Apex to handle large-scale data like a champ, there are a few principles you need to embrace early on.

1. Bulkification is Non-Negotiable

This is the golden rule of Apex. Salesforce processes records in batches — your code should, too.

Don’t do this:

for (Account acc : Trigger.new) {
    // One DML statement per record: 200 trigger records means 200 inserts,
    // blowing past the 150-DML-statements-per-transaction limit.
    insert new Contact(LastName='Test', AccountId=acc.Id);
}

Do this instead:

List<Contact> contactsToInsert = new List<Contact>();
for (Account acc : Trigger.new) {
    contactsToInsert.add(new Contact(LastName='Test', AccountId=acc.Id));
}
// One DML statement for the entire batch, no matter how many records fired the trigger.
insert contactsToInsert;

Key Reminder:

  • Think in terms of collections, not single records.
  • Apply this mindset to SOQL, DML, and calls to external services.

2. Use Collections Smartly

When dealing with large datasets, efficient use of collections — Maps, Sets, and Lists — can make or break performance.

Here are a few pro tips (a sketch follows the list):

  • Use Maps to relate records (Map<Id, Account>) so you can quickly reference values later.
  • Use Sets to avoid duplicate IDs and reduce processing.
  • Never put a loop inside another loop that performs SOQL or DML.
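To make that concrete, here’s a minimal sketch of one workhorse collection pattern: grouping child records by parent ID so later logic can walk each group without nested queries or repeated filtering. The variable names are illustrative.

// Group contacts by their parent Account Id using a Map of Lists.
Map<Id, List<Contact>> contactsByAccount = new Map<Id, List<Contact>>();
for (Contact c : contacts) {
    if (c.AccountId == null) {
        continue; // skip contacts with no parent account
    }
    if (!contactsByAccount.containsKey(c.AccountId)) {
        contactsByAccount.put(c.AccountId, new List<Contact>());
    }
    contactsByAccount.get(c.AccountId).add(c);
}
// Each account's contacts are now one map lookup away: contactsByAccount.get(accountId)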

3. Avoid SOQL/DML in Loops — Seriously

Yes, it’s Apex 101. But you’d be surprised how often it’s ignored.

Bad:

for (Contact c : contacts) {
    // One query per contact: 200 contacts means 200 SOQL queries.
    Account acc = [SELECT Name FROM Account WHERE Id = :c.AccountId];
}

Good:

Set<Id> accountIds = new Set<Id>();
for (Contact c : contacts) {
    accountIds.add(c.AccountId);
}
// A single query for all parents, keyed by Id for fast lookups afterwards.
Map<Id, Account> accountMap = new Map<Id, Account>(
    [SELECT Name FROM Account WHERE Id IN :accountIds]
);

Why it matters:
SOQL queries inside loops multiply your query count quickly. With a limit of 100 SOQL queries per synchronous transaction (200 in asynchronous Apex), it’s a quick way to hit the wall.
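To finish the example, the map lets a follow-up loop read each related account without issuing any further queries:

for (Contact c : contacts) {
    Account acc = accountMap.get(c.AccountId); // no SOQL inside the loop
    if (acc != null) {
        System.debug(c.LastName + ' belongs to ' + acc.Name);
    }
}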

4. Leverage Asynchronous Processing

Sometimes, no matter how clean your code is, the job is just too big. That’s where async Apex saves the day.

Your options:

  • Future methods (great for lightweight, non-urgent tasks such as simple callouts)
  • Queueable Apex (chainable, and it can carry complex object state that future methods can’t)
  • Batch Apex (best for processing massive volumes)
  • Scheduled Apex (for delayed or recurring jobs)

When to use what:

Scenario                   Best Option
5,000+ records             Batch Apex
Chain jobs with context    Queueable Apex
Run at a specific time     Scheduled Apex
Simple async callout       Future method
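And here’s a rough sketch of the Batch Apex shape itself. The class name and the Processed__c checkbox field are hypothetical stand-ins, not anything from a standard org.

// A minimal Batch Apex skeleton: query in start(), process one chunk per execute(),
// wrap up in finish(). CaseCleanupBatch and Processed__c are illustrative names.
public class CaseCleanupBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // A QueryLocator can cover up to 50 million records.
        return Database.getQueryLocator(
            'SELECT Id FROM Case WHERE Processed__c = false'
        );
    }

    public void execute(Database.BatchableContext bc, List<Case> scope) {
        // Each execute() runs in its own transaction with fresh governor limits.
        for (Case c : scope) {
            c.Processed__c = true; // mark as handled so reruns skip it
        }
        update scope; // one DML statement per chunk
    }

    public void finish(Database.BatchableContext bc) {
        // Chain follow-up work here, e.g. enqueue the next job.
    }
}

// Launch with a chunk size of 2,000 (the QueryLocator maximum per execute):
// Database.executeBatch(new CaseCleanupBatch(), 2000);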

5. Indexes and Selectivity — The SOQL Power-Up

On large objects, SOQL queries with non-selective filters can fail outright with a “non-selective query” error. Use indexed fields (like Id, CreatedDate, external ID fields, or custom indexed fields) in your WHERE clause.

Good:

[SELECT Id FROM Contact WHERE AccountId = :someId]

Avoid:

[SELECT Id FROM Contact WHERE FirstName LIKE 'A%']

Pro Tip:
Use the Query Plan tool in the Developer Console to inspect query selectivity. It tells you whether your query is likely to use an index or fall back to a full table scan.

6. Limit Your Data Scope Intelligently

Don’t always assume you need to process everything. Sometimes, optimizing means processing only what’s necessary.

Ask yourself (a sketch follows the list):

  • Can I filter by date or status?
  • Can I use a flag to mark already-processed records?
  • Can I store progress in a custom setting or custom metadata?
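For example, a recurring job might touch only unprocessed records changed recently. As in the batch sketch above, Processed__c is a hypothetical custom checkbox serving as the flag:

// Narrow the scope: only unprocessed cases modified in the last 7 days.
// Processed__c is a hypothetical custom checkbox used as the "already done" flag.
List<Case> workload = [
    SELECT Id
    FROM Case
    WHERE Processed__c = false
    AND LastModifiedDate = LAST_N_DAYS:7
];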

7. Use Custom Settings or Custom Metadata Types for Configurations

Avoid hardcoding configuration values and switches into your Apex.

Why?

  • It keeps your code clean.
  • It lets admins change behavior without a code deployment (see the sketch below).
  • It improves maintainability for large orgs with multiple admins/devs.
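Here’s a minimal sketch of reading a switch from a custom metadata type. Batch_Setting__mdt, Is_Active__c, and Batch_Size__c are hypothetical names for illustration:

// Read configuration from a custom metadata type instead of hardcoding it.
// Batch_Setting__mdt, Is_Active__c, and Batch_Size__c are hypothetical names.
Batch_Setting__mdt setting = Batch_Setting__mdt.getInstance('Case_Cleanup');
if (setting != null && setting.Is_Active__c) {
    Integer chunkSize = setting.Batch_Size__c.intValue();
    Database.executeBatch(new CaseCleanupBatch(), chunkSize);
}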

Real-World Story: When a Batch Job Saved the Day

On one of my projects, we had to process 2 million case records to update contact information based on external input. Someone tried to run the logic inside a trigger, and the whole org slowed down.

We rewrote it using:

  • A batch class that handled 10,000 records per execution.
  • A custom metadata switch to turn it off anytime.
  • A Queueable Apex job to kick off the batch automatically after certain file imports.

Result?
It ran flawlessly, every single day, without tripping over limits. And users didn’t feel a thing. 
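For the curious, the hand-off looked roughly like this, reusing the hypothetical names from the sketches above; this is an illustration of the pattern, not our production code:

// Queueable that launches the batch after an import completes.
// LaunchCaseBatchJob, Batch_Setting__mdt, and CaseCleanupBatch are illustrative names.
public class LaunchCaseBatchJob implements Queueable {
    public void execute(QueueableContext ctx) {
        // Honor the metadata kill switch before launching anything.
        Batch_Setting__mdt setting = Batch_Setting__mdt.getInstance('Case_Cleanup');
        if (setting != null && setting.Is_Active__c) {
            Database.executeBatch(new CaseCleanupBatch(), 2000);
        }
    }
}

// Enqueue it once the file import finishes:
// System.enqueueJob(new LaunchCaseBatchJob());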

Final Thoughts: Optimization is a Mindset

Writing ultra-optimized Apex isn’t just a technical thing — it’s a way of thinking. It’s about respecting limits, planning for scale, and making your code future-proof. Salesforce gives you the tools, but you need to wield them wisely.

Remember:

  • Bulkify always.
  • Think in Maps, Lists, and Sets.
  • Go async when needed.
  • Query smart, not hard.
  • Test your logic with high-volume data.

FAQs

Q: Can I use Apex to update millions of records in one go?
A: Not in a single transaction. Use Batch Apex or break it down using Queueables or Scheduled jobs.

Q: What’s the best way to debug performance issues?
A: Use debug logs, the Query Plan tool, and Limits methods such as Limits.getQueries() and Limits.getDmlStatements() to monitor usage.
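For instance, you can log consumption mid-transaction with the standard Limits system class:

// Log how much of each governor limit the current transaction has consumed.
System.debug('SOQL queries: ' + Limits.getQueries() + ' of ' + Limits.getLimitQueries());
System.debug('DML statements: ' + Limits.getDmlStatements() + ' of ' + Limits.getLimitDmlStatements());
System.debug('CPU time (ms): ' + Limits.getCpuTime() + ' of ' + Limits.getLimitCpuTime());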

Q: How many records can I return in a SOQL query?
A: A single transaction can retrieve at most 50,000 rows across all its SOQL queries. Batch Apex works around this: the QueryLocator returned by start() can cover up to 50 million records, processed in separate chunks.

