Mastering Complex Data Loads with Efficient Batch Flows

Instead of manually preparing complex data loads, we can use flows and custom objects to ingest CSV data, perform transforms and business logic, and then update or insert records in Salesforce.

When you need to regularly load data from a spreadsheet that requires some processing or transforming fields into records (or both) there are a few ways to accomplish this. We can use lots of VLOOKUPS on the spreadsheet with an upsert, we can use a tool like Boomi, or we can go the route I chose and use Salesforce Flow.

The Use Case

Keeping track of healthcare gaps is crucial for health insurance companies to ensure that their members receive proper medical attention and follow-up visits. My client works with these carriers to monitor the gaps in care their members have with their providers.

For example, if a member is enrolled with Humana, the insurance company wants to make sure that they have an annual check-up and eye exam. To incentivize members to keep up with their appointments, the carriers might offer cash rewards.

Every month, the insurance carriers send my client CSV sheets detailing the gaps in care for all of their members. However, the spreadsheets can be a hassle to handle as they come in different formats and sometimes have inconsistent information. To tackle this challenge, my client uses advanced techniques to process and transform the data into meaningful records, keeping track of each member's gaps in care and working towards closing them.

This carrier provides an incomplete, complete, and incentive column. There is just one row for each member:

‍

‍

This carrier provides a column with the type, and each member has a row for each type:

‍

‍

The Solution

To accomplish this, I created a streamlined, automated process. I've crafted a custom database object called Member_Care_Gap_Load__c, specifically designed to accommodate the vast amount of data from spreadsheets received from health insurance companies. With a daily check on all open records, this process will seamlessly manage the care gap tracking for each member, ensuring no lapses are left unchecked. The first type of spreadsheet, where there is one row for each member, will undergo a meticulous process of loading 6 times into the database, capturing every gap represented. The second type of spreadsheet, where each member's information is repeated multiple times, will undergo a more straightforward loading process, as it will be loaded just once into the database. Then, the scheduled batch flow will take over and seamlessly manage the rest of the process.

With 100,000 records to process, I was worried about hitting speed and limits in the system. But to my delight, the batch processing for the Member_Care_Gap_Load__c custom object was lightning fast, completing in just 15 seconds without a hitch. However, my initial approach for another batch process I have of checking every member in the database for gaps was a bit too ambitious, and it hit a limit due to the combined flow filters in the organization being capped at 250,000 records per 24 hours. I had to be strategic and put in better filters to only focus on members with potential gaps. One small mistake, such as incorrect record formatting, could cause bring an entire batch of 200 records to a screeching halt. The other batches of 200 records would still process, just the batch with the single error would fail. It's a delicate balance, but with the right approach, this process proved to be a reliable solution.

Here is my entry criteria for the batch to start:

‍

‍

And that is all the batch is, the entry criteria and the time it runs, and an element calling a subflow. I created an auto launched subflow to call from the batch and while it's not necessarily needed, it allows me to change the batch start time, without touching the flow, and vice-versa.

‍

‍

I have a Member_Care_Gap_Load__c record variable set as input in my subflow, and I simply pass the entire record in from the batch flow.

‍

‍

This is the general process in a nut shell. Is the record there -- Yes? Should the process continue? If not, it marks the original load record that kicked off the batch as processed so it’ll exit the flow and not try to process again.

‍

‍

The solution that was actually implemented is a little more complex than this as there is a sub flow to process the half year gaps on their own to make changes to any one flow a little easier. The idea is the same; decisions on what the data is, and then assignments to set the needed fields.

Into The Weeds

The first thing I do is get all my contacts into a report and use VLOOKUP to get their contact ID. The hope was to do this in the flow, but the unique ID in my org is an encrypted field, and you can’t use flow to filter on encrypted fields in a Get Records query. There are some workarounds, but I opted to use Excel and get the Record ID onto my import sheet. I made a field for the record ID on my load object.

The very first decision I have is to decide if it a record even needs to be processed. My spreadsheets have a Boolean gap-in or a date field to indicate if they have this gap. If either of those values isn’t present, I update the load record to "processed", add a message, and end the flow. There are many people on the sheet that don’t have each gap. I do a get records for the carrier record type (record type on account) and the carrier record (BCBS, Humana, etc.). I don’t have a decision on these because they should exist, but I technically should have a decision so I would know if they were missing. They are an important part of the process and should already exist.

Next, I do a GET query on the contact record. I’m doing it with VLOOKUP, but I could have used Unique ID if it wasn’t encrypted. It's important to note that not every person on the sheet is in Salesforce. I’m not creating them, so I just update the load record indicating they don’t exist. This would let me use this data for a report to see who was missing.

Finally, I check that the Care Gap and Carrier Care Gap records exist. If they don’t exist, I update the load record. These should exist prior to any record entering the flow. I setup the records using a naming convention based on gap type/carrier. Gap types may come on the spreadsheets as a code or a full name (ex. BCS or Breast Cancer Screening.)

‍

I use a CASE() formula in my flow to normalize the type. I could use Metadata Types to make this more dynamic, but it only has to be set once so I just used a formula. I also sometimes shorten carrier name in the gap name due to field length, so I have a formula to normalize how I’ve shortened names (Blue Cross Blue Shield to display as BCBS, but the actual account is the full name).

Using Get Records

Now we do a Get Record query for the contacts gap record. This is where the real work comes in. My rules are - it has to be for the carrier gap we’re loading, be between the start/expiration date ranges, and be related to the contact (ex. BCBS – BCS – Michelle Lavalette – current date)

· If it exists and was marked as closed on the sheet, we’ll update it as closed

· If it exists but was not marked as closed on the sheet, we’ll do nothing

· If it doesn’t exist we’ll create it.

‍

This is a decision element with 3 nodes, one for each of the criteria outlined.

The do nothing path is just the default path, it just goes to a final update load record to indicate it has been processed.

‍

Found and close will update the gap record with a close status/date. I have a formula date field for close date, because it was either provided or not. If it hasn't been provided, I just use the current date.

‍

The create gap path has more decisions. First I have to check if it’s an open or closed record, and set the status accordingly, as the record might just come in as closed. I have an assignment element after each decision to set a statusVar with the correct status. You could set a default for the statusVar and only use an assignment at one path. I also have a closeDateVar so I can fill in the closed date or not depending on the path. I tend to name variables with Var as the end, and formulas with F at the end, so it’s easy to see quickly which is which.

I create the member gap and fill in all the values. There's a formula to name it - member name + carrier gap name, but it's capped at a length of 80 so that it doesn’t hit the name field limit.

‍

Left({!getMemberProfile.Name}+""+{!carrierCareGapNameF},80)

Next I check to see if there is an incentive. Some gaps have them, some don’t. If it does, I create it.

‍

The final step is to mark the load records as processed.

Conclusion

When it comes to loading the same type of data into Salesforce over and over again, I found myself needing an efficient solution. With varying data but the same target objects, I needed to ensure accuracy and minimize room for human error. That's when I discovered the power of batch flows. With a simple set up, I was able to quickly and easily load all the records from my spreadsheet, updating and creating them as needed, and seamlessly linking them to the correct sub-records.

‍

Thanks to this batch flow, I no longer have to worry about the possibility of human error causing incorrect data to be created. And with the ability to quickly load the data, my work has become much more streamlined and efficient.

‍

Thank you! Your submission has been received!

Want to chat sooner? Schedule a call with us today!

Schedule a Consultation

Oops! Something went wrong while submitting the form.

Using Batch Flows to Perform Complex Data Loads

The Use Case

The Solution

Into The Weeds

Using Get Records

Conclusion

Related Insights