Backups Are Not Recovery Until You Have Restored Something
Ask almost any organisation whether it has backups and the answer is yes.
Usually with confidence. Sometimes with great confidence. Occasionally with the kind of confidence that suggests nobody has asked a follow-up question since 2018.
Backups are important, necessary, and one of the few things in IT that can turn a very bad day into a merely inconvenient one.
But here is the part that tends to make the room go quiet:
A backup is not the same as recovery.
A backup says, “We have a copy.”
Recovery asks, “Can we get the business running again?”
Those are not the same question.
The green tick problem
Backup systems love a green tick.
- Job completed successfully.
- Data copied.
- Snapshot taken.
- Retention applied.
Everything looks lovely.
And to be fair, that is good news.
But a successful backup job only proves that the backup job completed. It does not prove that:
- the data is usable
- the restore process works
- the right people know what to do
- the service can be rebuilt
- dependencies are understood
- recovery can happen quickly enough
- the backup is protected from ransomware
- the business can operate afterwards
A dashboard full of green ticks is comforting.
So is a fire alarm test.
Neither one means you are ready for the building to be on fire.
Backup is technical. Recovery is operational.
Backup is the act of creating a copy.
Recovery is the act of bringing a service, process, or business function back to a usable state.
That difference matters.
A file restore is one thing.
A server restore is another.
A business service restore is something else entirely.
To recover a business service, you may need:
- data
- applications
- configurations
- identity and access
- certificates
- service accounts
- DNS records
- firewall rules
- integrations
- vendor support
- user communication
- business prioritisation
The backup is only part of the story.
Sometimes it is the easiest part.
Why organisations confuse backup with recovery
Because backup is visible and measurable.
You can see backup jobs, storage usage, success rates, schedules, and retention.
Recovery is harder.
Recovery asks uncomfortable questions:
- What do we restore first?
- Who declares a recovery event?
- Who has authority to make decisions?
- What if the network is unavailable?
- What if the domain is unavailable?
- What if the backup platform itself is affected?
- What if the person who understands the system is on leave?
- What if the documentation is wrong?
- What if the restore takes longer than the business can tolerate?
These are less pleasant than green ticks.
So organisations often stop at backup confidence.
Which is understandable.
And dangerous.
The “we have backups” trap
“We have backups” is one of those phrases that sounds reassuring until you start interrogating it.
Backups of what?
How often?
Stored where?
Protected how?
Retained for how long?
Tested when?
Restored by whom?
To what environment?
In what order?
With what dependencies?
Against what recovery target?
At that point, “we have backups” either becomes a robust recovery conversation or begins quietly backing out of the room.
The trap is assuming that because backups exist, recovery is assured.
It is not.
Recovery has to be proven.
Ransomware changed the conversation
In older disaster recovery thinking, backups were often treated as protection against accidental deletion, hardware failure, or site-level disruption.
Those risks still exist.
But ransomware changed the conversation.
If an attacker can access, encrypt, delete, or corrupt backups, then the backups may not save you.
This is why modern recovery thinking needs protected backups.
That might include:
- immutability
- offline copies
- restricted administrative access
- separate credentials
- backup monitoring
- tested restore points
- clear retention
- separation from production identity systems
The exact design depends on the environment.
But the principle is simple:
The thing protecting you should not be as easy to damage as the thing it protects.
Restore testing is not optional theatre
A restore test is where backup confidence becomes evidence.
Without restore testing, you are relying on assumption.
And assumption is very bad at restoring databases.
A restore test helps answer:
- does the backup actually restore?
- is the restored data usable?
- how long does it take?
- are the instructions accurate?
- are the right people involved?
- are there missing dependencies?
- does the restored system behave as expected?
This does not mean you need to run a full disaster recovery exercise every Friday while everyone wears hi-vis and someone brings a clipboard.
Start smaller.
Restore a file.
Restore a mailbox.
Restore a virtual machine.
Restore a critical application into a test environment.
Validate a recovery runbook.
Time the process.
Record what broke.
That is where the value lives.
The warning signs your recovery position is weaker than you think
You may have a backup-but-not-recovery problem if:
- backup jobs are monitored, but restores are rarely tested
- nobody knows the recovery order for key systems
- backups use the same credentials or identity dependencies as production
- recovery documentation is out of date
- only one person knows how to restore critical systems
- restore times are unknown
- backup success is reported, but recovery capability is not
- there is no clear decision-maker during recovery
- backups exist, but nobody has tested whether users can actually work afterwards
- ransomware impact on backups has not been considered
If the recovery plan depends on someone saying “we’ll figure it out at the time”, that is not a plan.
That is improvisation with storage.
What good looks like
Good recovery planning does not need to be theatrical.
It needs to be clear.
1. Critical systems are identified
You cannot recover everything first.
The organisation needs to know which systems matter most, and in what order they should return.
This requires business input, not just IT instinct.
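One way to make the order explicit is to write down what each service needs before it can start, then let a topological sort produce a sequence. The sketch below uses Python's standard graphlib module. The service names and dependencies are invented for illustration, and the real list would come out of the business conversation, not out of a script.

```python
from graphlib import TopologicalSorter

# Hypothetical services, each mapped to what must be running before it.
DEPENDS_ON = {
    "identity": set(),
    "dns": set(),
    "database": {"identity", "dns"},
    "erp": {"database", "identity"},
    "email": {"identity", "dns"},
}


def recovery_order(depends_on):
    """Return services in an order where every dependency comes back first."""
    return list(TopologicalSorter(depends_on).static_order())


order = recovery_order(DEPENDS_ON)
print(order)
```

If two services turn out to depend on each other, graphlib raises CycleError. That is not a bug in the script. That is a finding.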
2. Recovery ownership is defined
During disruption, ambiguity is expensive.
You need to know:
- who leads technical recovery
- who makes business priority decisions
- who communicates internally
- who contacts suppliers
- who records actions
- who signs off restored services
Otherwise, the recovery effort becomes a meeting with cables.
3. Backups are protected
Backups must be resistant to accidental deletion, malicious activity, and production compromise.
That means thinking carefully about access, immutability, separation, and monitoring.
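A very small first check on separation is to compare who can administer production with who can administer the backup platform. The sketch below assumes you can export both lists of account names. The names shown are hypothetical, and a real review would also look at groups, service principals, and shared identity providers.

```python
def shared_accounts(production_admins, backup_admins):
    """Return account names that hold admin rights on both sides."""
    return sorted(set(production_admins) & set(backup_admins))


# Hypothetical exports from each platform.
production_admins = ["svc-backup", "alice.admin", "domain-admin"]
backup_admins = ["svc-backup", "backup-operator", "domain-admin"]

shared = shared_accounts(production_admins, backup_admins)
print(shared)
```

Every name in the result is an account whose compromise reaches both the thing being protected and the thing protecting it.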
4. Restore tests happen
Not hypothetically. Actually.
A restore test should produce evidence:
- date
- scope
- result
- time taken
- issues found
- improvements needed
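The evidence does not need special tooling. As a minimal sketch, the six fields above fit in a small record. The class name, field names, and example values here are assumptions made for illustration, not a standard format.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RestoreTestRecord:
    """One restore test, captured as evidence."""
    test_date: date
    scope: str
    result: str  # for example "passed", "partial", or "failed"
    time_taken: timedelta
    issues_found: list[str] = field(default_factory=list)
    improvements_needed: list[str] = field(default_factory=list)

    def summary(self) -> str:
        minutes = int(self.time_taken.total_seconds() // 60)
        return (
            f"{self.test_date.isoformat()} | {self.scope} | {self.result} | "
            f"{minutes} min | {len(self.issues_found)} issue(s)"
        )


# Invented example values, purely to show the shape of a record.
record = RestoreTestRecord(
    test_date=date(2024, 1, 15),
    scope="Finance file share, single folder",
    result="partial",
    time_taken=timedelta(minutes=42),
    issues_found=["Runbook referenced a retired server name"],
    improvements_needed=["Update the runbook and retest"],
)
print(record.summary())
```

A spreadsheet with the same columns works just as well. What matters is that the record exists and somebody can find it.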
5. Recovery documentation exists
The recovery process should not live entirely in one person’s head.
Because if your recovery strategy is “ask Ben”, please see Article 1. It brought snacks.
6. Recovery expectations are realistic
The business should know what recovery can actually achieve.
There is a big difference between:
- “we can restore that file today”
- “we can restore that service within four hours”
- “we can rebuild that platform within two days”
- “we do not actually know”
The last one is common.
It is also unhelpful.
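One way to keep expectations honest is to put the time the business says it can tolerate next to the time a restore has actually taken. The sketch below assumes both are recorded per service. The service names, targets, and measured times are hypothetical.

```python
from datetime import timedelta

# Hypothetical targets agreed with the business.
targets = {
    "file_share": timedelta(hours=8),
    "erp": timedelta(hours=4),
    "email": timedelta(hours=24),
}

# Hypothetical times from restore tests. None means never timed.
measured = {
    "file_share": timedelta(hours=1),
    "erp": timedelta(hours=9),
    "email": None,
}


def recovery_position(target, measured_time):
    """Classify a service by comparing measured restore time to its target."""
    if measured_time is None:
        return "unknown"
    return "within target" if measured_time <= target else "exceeds target"


report = {name: recovery_position(targets[name], measured.get(name)) for name in targets}
for name, position in report.items():
    print(f"{name}: {position}")
```

The "unknown" rows are the ones to chase first, because they are the ones nobody can currently defend.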
How to improve recovery without causing panic
Do not start by announcing that nobody is safe and the backups are probably lies.
That tends to reduce trust.
Start with evidence.
Step 1: List critical services
Identify the systems and data the organisation depends on most.
Ask:
- what stops the business operating?
- what affects customers?
- what affects finance?
- what affects quality, traceability, engineering, warehouse, or compliance?
- what has to come back first?
Step 2: Map each service to its backup position
For each critical service, ask:
- is it backed up?
- how often?
- where?
- how is it protected?
- who owns it?
- when was it last restored?
Unknown answers are not failures. They are findings.
Findings are useful.
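The mapping can start as a simple table where an unanswered question is recorded as such. In this sketch, None stands for "nobody could answer". The services, field names, and answers are assumptions for illustration only.

```python
QUESTIONS = ["backed_up", "frequency", "location", "protection", "owner", "last_restored"]

# Hypothetical answers gathered per service; None means unknown.
positions = {
    "erp": {
        "backed_up": True, "frequency": "daily", "location": "offsite",
        "protection": "immutable copy", "owner": "IT operations",
        "last_restored": None,
    },
    "file_share": {
        "backed_up": True, "frequency": "hourly", "location": "second site",
        "protection": None, "owner": None,
        "last_restored": "2024-01-15",
    },
    "email": {
        "backed_up": True, "frequency": "daily", "location": "vendor cloud",
        "protection": "separate credentials", "owner": "IT operations",
        "last_restored": "2023-11-02",
    },
}


def findings(positions):
    """Return, per service, the questions that have no answer yet."""
    result = {}
    for service, answers in positions.items():
        unknown = [q for q in QUESTIONS if answers.get(q) is None]
        if unknown:
            result[service] = unknown
    return result


print(findings(positions))
```

The output is a to-do list, not a scorecard.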
Step 3: Run small restore tests
Start manageable.
Pick a file share, mailbox, small system, or non-production restore.
The point is to learn, not perform disaster theatre.
Step 4: Build recovery runbooks
For critical services, document:
- what to restore
- restore order
- dependencies
- access required
- validation steps
- communication notes
- rollback or escalation points
Keep them usable.
A recovery runbook that cannot be followed under pressure is decorative.
Step 5: Review after change
Recovery plans drift when systems change.
If a system is migrated, reconfigured, replaced, integrated, or moved to the cloud, the recovery approach should be checked.
A backup plan from three architectures ago is not a plan.
It is a memory.
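Drift is easy to spot if two dates are kept per service: when the system last changed in a significant way, and when its runbook was last reviewed. The sketch below assumes those dates are recorded somewhere. The field names and values are hypothetical.

```python
from datetime import date

# Hypothetical records; runbook_reviewed is None if no review is on record.
services = [
    {"name": "erp", "last_changed": date(2024, 3, 1), "runbook_reviewed": date(2023, 11, 10)},
    {"name": "email", "last_changed": date(2023, 6, 5), "runbook_reviewed": date(2024, 1, 20)},
    {"name": "file_share", "last_changed": date(2022, 9, 12), "runbook_reviewed": None},
]


def stale_runbooks(services):
    """Return names of services whose runbook predates their last change."""
    return [
        s["name"]
        for s in services
        if s["runbook_reviewed"] is None or s["runbook_reviewed"] < s["last_changed"]
    ]


stale = stale_runbooks(services)
print(stale)
```

A date comparison cannot tell you whether the runbook is still right. It can tell you which ones nobody has looked at since the ground moved.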
Why this matters beyond IT
Recovery is a business issue.
IT can restore systems, but the business must decide priorities.
During a serious incident, the key questions are not purely technical:
- which service matters most?
- how much downtime is tolerable?
- what manual workarounds exist?
- who communicates with customers?
- what legal or regulatory obligations apply?
- when do we escalate externally?
- when is a restored service good enough to use?
These are business decisions.
If they are not made before disruption, they will be made during disruption.
Which is rarely when humans are at their most elegant.
The role of policy
This is where backup and resilience policy matters.
A good policy should make clear that:
- critical systems must have defined backup requirements
- backups must be protected
- restore testing must occur
- recovery ownership must be documented
- recovery documentation must be maintained
- backup exceptions must be risk-assessed
- recovery capability must be reviewed after major change
That is not bureaucracy.
That is the organisation deciding it would prefer recovery to be more than a motivational statement.
Recovery is what happens when the copy becomes a working service again, in the right order, with the right access, within a timeframe the business can tolerate.
Until you have restored something, you do not fully know what you have. Confidence is not a restore point.