Improve Your Application By Running Fewer Tests

Yes, you read the headline correctly.

I don’t need to explain the importance of writing lots of tests. That should save us some time! You’ve heard it from me and others enough times now that I KNOW you’ve got like a billion Apex tests in your org, because we keep going on and on about Write More Tests!

The downside of having like a billion Apex tests? You are running a billion Apex tests. What good is a test if you don’t run it, right? Running a billion tests takes time. Time we saved just now by not explaining how you must write more tests, sure, but other time, too. Time you need to get your application deployed, and quickly.

How, then, do we resolve this test-time continuum paradox? We run fewer tests!

Running All The Tests

Slow down, there, partner. Before we start running fewer tests, we all need to agree that we still need to run all the tests. (We still need to write all the tests, too.) Before the step in your process where you deploy to production, you need to have a step (or series of steps) wherein all of your tests are executed. Exactly when will depend on your development and release process. All I know is that, at some point between a line of code being authored and that line of code getting to the production org, each line of code should be part of a run-all-tests party.

Despite the need to run all the tests, we don’t need to run all the tests all the time.

Running Tests During Production Deployment

One place you definitely do NOT need to run all your tests is on a production deployment.

Per the previous section, you’ve already run all the tests at some point in the process. Maybe you’ve run them more than once. Maybe you’ve run some of the tests dozens of times; we’ll get to all of that. Regardless, you’ve run your tests. I trust you. I trust you because you’re the one who gets called at 2AM if something goes wrong, so it’s in your interest not to try and get around running these billion tests you have authored.

When tests run as part of a production deployment, they are slow. On average, tests consume 90% of the time spent in the deployment. The reason: all tests in the deployment must run on a single thread, so they cannot run in parallel.

Why, Josh? Why?? Because of our handy roll-back feature! If a single test fails, we don’t commit your deployment, to protect your application from run-time failures. In order to not-commit your deployment, we have to not-close the database connection. That means one thread and one thread only. We are stuck with a serial test operation plan.

You’re definitely going to want to run SOME tests as part of this deployment + test + maybe-rollback-in-panic process. You want to validate that your basic functionality won’t be impacted by the deployment, in case of any differences between your production environment and the others in which you’ve previously run all the tests. You’ll want to have some confidence that there won’t be any major troubles with data integrity. Also, we’re still enforcing the 75% code coverage laws.

You do not need to run every single one of your tests in order to achieve these goals.

What you need is a “smoke test” suite. This is a series of tests that do the work of trying out the major functionality in all areas of your application. It’s a series of tests that covers 75% of your code, but you only need a single coat of coverage to meet the requirement. This is NOT a series of tests that tries all the corner cases, nor one that covers 100% of the code multiple times over. It’s your basic set of tests that, should any one of them fail, you’ll want to roll back the deployment in panic before real pandemonium sets in.

If you create a suite of tests like this, you can specify it as the tests to run upon deployment. So long as you get 75% coverage with the suite, your deployment will succeed and commit. It will succeed much more rapidly than running all the tests.
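To make that concrete, here’s a sketch of what a smoke-test-only deployment might look like with the Salesforce CLI. The test class names and the “production” org alias are hypothetical stand-ins; note that a deployment takes individual test class names rather than suite names, so you list the smoke test classes themselves.

```bash
# Deploy to production, running ONLY the named smoke test classes.
# The coverage computed from these tests must still satisfy the 75% rule.
sfdx force:source:deploy \
  --sourcepath force-app \
  --testlevel RunSpecifiedTests \
  --runtests SmokeTest_Billing,SmokeTest_Opportunity,SmokeTest_Accounts \
  --targetusername production \
  --wait 60
```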

Running All The Tests, Redux

OK, we’re feeling dangerous now! We’re deploying without running all the tests. However, you may already be living dangerously by allowing people to make changes directly in your production org! <insert giphy of man drinking and firing a gun wildly into the air> How can you be sure that the changes you’re deploying won’t conflict with all these wild-west changes happening?

If you are living with bandito admins changing logic in production on the fly with no safety belt, a good place to run all the tests is a VERY recent sandbox copy. We’ll call this a staging sandbox. In the days prior to your deployment, you create a sandbox copy of the latest production org. You deploy the planned production changes into this sandbox. After deploying, you launch all the tests.

Why launch tests after, and not during the deployment? Because you can run the tests in parallel after the deployment! Since you don’t need to roll back in panic if there’s a test failure in the staging sandbox, you can safely commit the deployment before running tests.
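As a sketch (the “staging” alias and source path are hypothetical), the staging flow with the Salesforce CLI is two steps: deploy and commit first without tests, then launch the full test run, which the platform can execute in parallel as long as parallel Apex testing isn’t disabled in the org’s test execution settings.

```bash
# 1. Deploy the planned release to the staging sandbox without running tests.
#    (NoTestRun is only allowed outside production, which is fine here.)
sfdx force:source:deploy \
  --sourcepath force-app \
  --testlevel NoTestRun \
  --targetusername staging \
  --wait 60

# 2. With the deployment committed, run every local (non-managed-package)
#    test in the org, in parallel.
sfdx force:apex:test:run \
  --testlevel RunLocalTests \
  --resultformat human \
  --targetusername staging \
  --wait 120
```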

If the tests all pass in this sandbox, you are ready for a production deployment. You cross your fingers that the banditos haven’t changed anything earth-shattering since your staging sandbox was created, or you set up rules against such things in that timeframe. You do a “smoke test”-only deployment to production. You’ve run all the tests in a very similar environment, and, if your smoke tests pass, you can feel confident that your production deployment will safely succeed.

Running Tests During Development

Now let’s go back to square one: fixing one of these tests that failed in the staging sandbox. Or fixing any issue. Or adding new features. Whenever you make changes, you have a billion tests that need to be run. Or do you? If you’re working on the UI for the billing form, do you really need to run tests related to the opportunity trigger?

When they’re running well, tests provide timely, actionable information. If there’s a failure, you want to convey that to the development team, so they can act on it by fixing whatever needs fixing before the problem gets too far. You want this information to arrive as quickly as possible, so your team remains unblocked.

In order to get timely results from test runs, you should break your tests up into functional groups. This could be by application. Or by development team. Or by feature. Or by the results of the last test match between India and Pakistan. Or by a friendly wager on that match. I won’t judge.
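On the Salesforce platform, Apex test suites map nicely onto these functional groups. A suite is just a piece of metadata listing test classes, so it can live in version control alongside the code it exercises. A minimal sketch, with made-up class names for a hypothetical billing area:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- force-app/main/default/testSuites/Billing_Basic.testSuite-meta.xml -->
<ApexTestSuite xmlns="http://soap.sforce.com/2006/04/metadata">
    <testClassName>BillingForm_BasicTest</testClassName>
    <testClassName>BillingCalculator_BasicTest</testClassName>
    <testClassName>InvoiceService_BasicTest</testClassName>
</ApexTestSuite>
```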

When making changes, the thing most likely to break is the thing you’re changing. Thus, the tests you’ve written for that part of the application are most likely to inform you of an issue. Those are the tests you should be running frequently, with each change.

It’s also a good idea to divide up the tests within each functional area into basic tests and extended tests. Basic tests are the happy path. Extended tests include the corner cases, the little things that happen as you deviate from the happy path. If you split these up well, you’ll find that the basic tests make up 20% of your test battery, but uncover 99% of your bugs. (Bonus fact: 73% of all statistics are made up on the spot. Look it up.)

Developers now only need to run the basic tests for their functional area when they’re checking in.
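In CLI terms, that check-in gate might be as small as this, reusing the hypothetical Billing_Basic suite from above:

```bash
# Run just the billing area’s basic suite against your development org.
sfdx force:apex:test:run \
  --suitenames Billing_Basic \
  --resultformat human \
  --wait 10
```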

Since we are only running a subset of tests, your team is likely to actually run the tests! Nothing makes developers skip test runs faster than a four-hour wait. Shorter test runs give you the timely feedback you were looking for. Running all the tests, every time you make a change, will get you whatever the opposite of “timely” is.

Stronger As You Go

I think of a development pipeline proceeding like a video game. It’s relatively easy to get past the first board; the next board is harder, and the one after is harder still, and so on until the final boss. (RIP Bowser. You are not forgotten.) Your testing strategy should be similar in progression, adding more and more tests into the mix as you move through various stages of a development and deployment pipeline.

We’ve already discussed the process for the first stage of development. You have developers running basic tests for the functional area they’re working in before checking into the source system. (Note: you may want to throw in the smoke tests for other functional areas, too, depending on how interconnected your functional areas are. Your mileage may vary.)

As we progress in the process, testing gets more rigorous. You can set up a CI system to run a test battery after every check-in. The tests at this step may be just the extended tests for the functional area, perhaps with basic tests from other relevant functional areas mixed in. The key point is: the tests being run here should be the ones likely to get you maximum results (paid out in test failures) in minimum time.
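In a CI job, that battery can be a single scripted step. A sketch, with hypothetical suite names, assuming your CLI version returns a nonzero exit code when a test fails (worth verifying), which is what fails the build:

```bash
#!/bin/bash
# CI step: extended tests for the changed area, plus basic tests from
# neighboring areas. set -e aborts the build if the test run fails.
set -e

sfdx force:apex:test:run \
  --suitenames Billing_Extended,Opportunity_Basic,Accounts_Basic \
  --resultformat junit \
  --outputdir test-results \
  --wait 30
```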

A feature branch will merge to a release branch or the master branch when a release or a patch is ready. This occurs after many contributions from each member of the team have been tested with the correct subset of tests. At this stage you might run all basic tests. You might run all extended tests. You might run a series of integration tests. Whatever the mix, you’re now running more tests than in previous stages of the pipeline, and you’re looking for any new information to send back to the development team. You should be seeing fewer and fewer failures as you move along, since you’ve caught the major ones in previous stages.

All of this leads up to running all the tests. At some late stage in your process, you should run them all. In a sandbox, of course, per the earlier discussion! You are confident, by this point, that you’ve found almost all the issues. You’re running all the tests now to be confident AND certain.

Full Circle

OK, maybe my headline was misleading. Your team might run the same number of tests. Maybe more. They’re going to run them efficiently, though. The team will run the tests that are relevant, so they can fix issues before you spend the time to run every single test. The team won’t run unnecessary tests – until it’s necessary to run them. <insert giphy of mind exploding>

Setting this up may seem like a mountain to climb. There is effort involved in identifying which tests belong in which functional bucket. There is effort in dividing tests into smoke, basic, and extended. There is effort in setting up a CI system. (Have you met my new friend Mr. SFDX?)

When you get to the top, you will see that it was all worth it. You will have a development team able to rapidly iterate without waiting forever to learn about problems. You will have confidence that your application is strong, without waiting as long for test runs to complete. You will reduce the time needed to deploy your application to production. You will spend less time waiting, and more time innovating.