Thinking Beyond M & M Security
For the uninitiated, “M & M” security is a pejorative term, comparing a security posture to candy: a hardened shell with a soft, delicious interior. This comparison is apt for a few reasons, but mainly because it highlights a larger theme in information security: why do we so consistently get things so wrong?
Foot In The Door
Even a partial compromise of an edge node in a multi-tiered architecture built on this security strategy can be catastrophic. Privilege escalation via credential siphoning, packet sniffing, or other means will eventually allow complete infiltration of that edge node, followed by further penetration of the network.
In a hypothetical M & M scenario, our adversary has already attained a complete compromise of our infrastructure, or will do so as a function of time. They may now proceed unhindered to attain their goal. In case it’s not clear, this is bad news.
Anatomy Of An Attack
For an adversary with shell access inside a network, there are a few immediate goals:
- Understand the network topology
- Siphon credentials, keys, accounts, repositories, and configs
- Escalate privileges
- Destroy evidence
- Create a backdoor through which access can be regained later. This is often done by opening a port to the internet and running an unsigned executable.
As devops engineers, we must consider each of these concerns and how best to protect against them.
At this point in the attack, the adversary already has an enormous advantage. Hindering an ongoing attack and successfully quarantining the intrusion, all without interrupting service to end users, becomes dramatically more challenging.
Attack Mitigation And Remediation
Affected infrastructure must be disconnected, isolated, and stopped for later analysis. Compromised infrastructure must be replaced with pristine infrastructure, and all credentials and keys must be rotated in a synchronized action. If any of these actions is not automated and practiced via playbooks, then given human error under duress, the odds of a quick recovery are not good.
For any known compromised node, all assets and unsecured attached resources must be considered compromised and treated as such.
So what is a better approach? The problem is complicated, and the solution involves both infrastructure hardening and organizational preparation.
Take steps during the design phase of a platform to make actions transparent, to enforce isolation at the hardware or OS level, and to make mitigation and remediation as painless as possible.
Automate Infrastructure Tests
Use infrastructure testing to automatically validate assumptions about the platform. These assumptions can range from provisioning tool correctness to security audits.
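As one illustration, a check like the following could run in CI against an exported rule set. This is only a minimal sketch: the rule format, the `find_exposed_rules` helper, and the set of allowed public ports are hypothetical, not the output of any particular provisioning tool.

```python
import ipaddress

# Assumption: only HTTPS is expected to face the internet.
ALLOWED_PUBLIC_PORTS = {443}


def find_exposed_rules(rules, allowed_ports=ALLOWED_PUBLIC_PORTS):
    """Return rules that accept traffic from anywhere on an unexpected port."""
    exposed = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"])
        open_to_world = network.prefixlen == 0  # 0.0.0.0/0 or ::/0
        if open_to_world and rule["port"] not in allowed_ports:
            exposed.append(rule)
    return exposed


def test_no_unexpected_public_ports():
    # Hypothetical rule set; in practice this would be loaded from the platform.
    rules = [
        {"name": "web", "source": "0.0.0.0/0", "port": 443},
        {"name": "ssh-from-bastion", "source": "10.0.1.0/24", "port": 22},
    ]
    assert find_exposed_rules(rules) == []
```

The same shape works for other assumptions, such as "no credential is older than its TTL" or "every host ships logs", as long as the assumption can be expressed as a function over exported state.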
Zero Trust Architecture
Inside The Network
Use internal proxies to verify requests between platform services. Disallow direct service-to-service communication by placing services on isolated networks that share an edge proxy.
Think about access, authorization, and process execution on a whitelist basis, instead of the traditional blacklist strategy. Does the internal proxy expect a connection from executable A on port N? If not, maybe we don’t want to outright block the connection, but we do want to monitor it closely and automatically generate a context-bound report of that connection’s activities.
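The whitelist decision described above might look something like this. It is a sketch under assumptions: the `Connection` type, the executable paths, and the port numbers are invented for illustration, and a real proxy would gather far more context for its report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    executable: str
    port: int


# Hypothetical whitelist: the (executable, port) pairs the internal proxy expects.
EXPECTED = {
    Connection("/usr/bin/orders-api", 8443),
    Connection("/usr/bin/billing-worker", 8444),
}


def evaluate(conn, expected=EXPECTED):
    """Return 'allow' for expected connections and 'monitor' for everything else."""
    return "allow" if conn in expected else "monitor"


def build_report(conn, decision):
    """Collect the context a responder would want attached to this connection."""
    return {
        "executable": conn.executable,
        "port": conn.port,
        "decision": decision,
        "expected": decision == "allow",
    }
```

Note that an unexpected connection is not dropped here. It is flagged for monitoring, which matches the idea of watching it closely rather than blocking it outright.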
Have a process in place for revoking and regenerating credentials at any level of the system. These credentials should be defined with minimal permissions, have a TTL, and live in an encrypted store. Credentials can be shared across services if the services reside within the same network space. However, credentials should never be used by services across network boundaries. The reason is simple: a credential that is valid in only one network space is of no use to an attacker anywhere else.
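To make the credential rules above concrete, here is a small sketch of a credential that carries its own network space, permission set, and TTL. The `Credential` class, the `may_use` helper, and the names `orders-net` and `db:read` are assumptions made for the example. Encrypted storage is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Credential:
    name: str
    network: str            # the network space this credential belongs to
    permissions: frozenset  # keep this set as small as possible
    issued_at: datetime
    ttl: timedelta

    def is_expired(self, now):
        return now >= self.issued_at + self.ttl


def may_use(cred, service_network, permission, now):
    """Usable only inside its own network, before expiry, for a granted permission."""
    return (
        cred.network == service_network
        and permission in cred.permissions
        and not cred.is_expired(now)
    )


# Hypothetical credential, valid for one hour inside "orders-net" only.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
cred = Credential(
    "orders-db-read", "orders-net", frozenset({"db:read"}), issued, timedelta(hours=1)
)
```

With this shape, a request from a service in another network space, a request for a permission that was never granted, and a request made after the TTL has passed are all refused by the same check.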
A Ship Metaphor
Modern ship design includes a feature called bulkheads. By partitioning the ship into sections, which can be easily shut off from one another, the ship becomes much more robust in the face of catastrophic failure.
Typically, the bulkhead pattern is applied to fault tolerance in distributed systems. The concept here is the same, but applied from a security perspective.
Applied to modern security architecture, one environment might be partitioned behind multiple firewalls. Each firewall has tight access controls, with keys and credentials isolated from one resource to another. If one network bulkhead is compromised, this isolation limits the blast radius of an attacker scooping credentials or sniffing traffic.
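A rough way to reason about blast radius is to ask which resources are unlocked by the credentials stored in a compromised segment. The inventory below is entirely hypothetical, and the sketch ignores traffic sniffing and lateral movement. It only models credential isolation.

```python
# Hypothetical inventory: which credentials live in which network segment,
# and which resources each credential unlocks.
CREDENTIALS_BY_SEGMENT = {
    "edge": {"edge-tls-key"},
    "app": {"orders-db-password"},
    "data": {"backup-bucket-key"},
}
RESOURCES_BY_CREDENTIAL = {
    "edge-tls-key": {"public-proxy"},
    "orders-db-password": {"orders-db"},
    "backup-bucket-key": {"backup-bucket"},
}


def blast_radius(compromised_segments):
    """Return every resource unlocked by credentials in the compromised segments."""
    exposed = set()
    for segment in compromised_segments:
        for credential in CREDENTIALS_BY_SEGMENT[segment]:
            exposed |= RESOURCES_BY_CREDENTIAL[credential]
    return exposed
```

With bulkheads in place, `blast_radius(["edge"])` covers only the public proxy. A flat network behaves as if every segment were compromised at once, so the radius is the whole inventory.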
Follow security logging best practices recommended by OWASP. Log often, and ship logs to a log aggregation service. In addition to aiding transparency and auditing, this makes destruction of evidence by the attacker much more difficult.
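As a sketch of the shipping step, Python's standard `logging` module lets a custom handler forward each record off the host as soon as it is written. The `ShippingHandler` class and the `shipped` list are stand-ins invented for this example. A real deployment would send each line to whatever aggregation service is in use.

```python
import json
import logging


class ShippingHandler(logging.Handler):
    """Forward each record, as one JSON line, to a shipper callable."""

    def __init__(self, ship):
        super().__init__()
        self.ship = ship  # hypothetical transport: sends one line to the aggregator

    def emit(self, record):
        payload = {
            "time": record.created,
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        try:
            self.ship(json.dumps(payload))
        except Exception:
            self.handleError(record)


shipped = []  # stand-in for the remote aggregation service
logger = logging.getLogger("auth")
logger.setLevel(logging.INFO)
logger.addHandler(ShippingHandler(shipped.append))
logger.warning("login failed for user_id=%s", 42)
```

Because the record leaves the host immediately, an attacker who later wipes local log files cannot remove the copy that already reached the aggregator.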
An organization’s employees are its most valuable asset, and this is doubly true when responding to a security incident. There is nothing more dangerous than a sentient, motivated, responsive attacker.
Plan To Be Pwned
Carve out a red team that will perform live penetration attacks against public environments. Their goal is root access, database dumps, API keys, and other credentials.
If the red team is successful, measure some key metrics:
- Time to intrusion discovery
- Was the attack successfully quarantined? How long did it take?
- Was end service affected?
- Duration of the attack
- Was the scope of the breach understood?
Collecting this information will help with a blameless retro and continuous improvement for the organization.
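The time-based metrics in the list above reduce to simple arithmetic on a handful of timestamps. The sketch below assumes those timestamps are recorded during the exercise. The `ExerciseTimeline` class, its field names, and the sample times are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ExerciseTimeline:
    intrusion_started: datetime
    intrusion_discovered: datetime
    quarantined_at: Optional[datetime]  # None if the attack was never contained
    attack_ended: datetime

    def time_to_discovery(self) -> timedelta:
        return self.intrusion_discovered - self.intrusion_started

    def time_to_quarantine(self) -> Optional[timedelta]:
        # Measured from discovery, since responders cannot act before then.
        if self.quarantined_at is None:
            return None
        return self.quarantined_at - self.intrusion_discovered

    def attack_duration(self) -> timedelta:
        return self.attack_ended - self.intrusion_started


# Made-up times for one exercise.
timeline = ExerciseTimeline(
    intrusion_started=datetime(2024, 3, 1, 9, 0),
    intrusion_discovered=datetime(2024, 3, 1, 9, 45),
    quarantined_at=datetime(2024, 3, 1, 10, 15),
    attack_ended=datetime(2024, 3, 1, 10, 15),
)
```

The remaining questions, such as whether end service was affected and whether the scope was understood, are judgment calls better recorded in the retro itself than computed.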
Unsuccessful attempts are equally important. Both scenarios help develop the organization’s playbooks. This article has some more good information on the topic.
Playbooks are nothing special. They are simply defined processes for security responders to follow given a breach scenario. A typical playbook will include what systems to check, how to perform a quarantine, and how to recover with pristine infrastructure, credentials, and configs, while minimizing mean time to recovery (MTTR).
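One way to keep a playbook practiced and automatable is to express it as data: an ordered list of steps, each with an action that reports success or failure. This is a minimal sketch, not a real incident-response tool. The `Step` and `Playbook` classes are hypothetical, and the lambdas stand in for real checks and automation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    description: str
    action: Callable[[], bool]  # returns True when the step succeeded


@dataclass
class Playbook:
    scenario: str
    steps: List[Step] = field(default_factory=list)

    def run(self):
        """Run steps in order and stop at the first failure so a human can take over."""
        completed = []
        for step in self.steps:
            if not step.action():
                return completed, step.description
            completed.append(step.description)
        return completed, None


# Placeholder actions; each lambda would be replaced by real automation.
playbook = Playbook(
    scenario="compromised edge node",
    steps=[
        Step("check affected systems", lambda: True),
        Step("quarantine the node", lambda: True),
        Step("rotate credentials", lambda: True),
        Step("redeploy pristine infrastructure", lambda: True),
    ],
)
completed, failed_at = playbook.run()
```

Stopping at the first failed step is a deliberate choice in this sketch. Under stress, it is safer for a responder to know exactly where the automation stopped than to guess which later steps ran.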
Focusing on perimeter security, to the exclusion of other concerns, is a one-dimensional approach to a hard problem.
Thinking about isolation of resources within an environment during the design phase will pay dividends. If, in order to compromise sensitive customer information, an attacker must breach multiple firewalls and steal encrypted credentials for every section of the infrastructure… well, the time this buys is crucial for responders to mount a coherent response.
When attack vectors are defined and understood, it becomes easier to monitor and respond more quickly. Diagnosis of the depth of the intrusion should be easier too. And when responders have practiced their roles, the likelihood of mistakes being made during a stressful moment is lessened.
Breaches are terrible, and may be inevitable in an engineer’s career, but our preparation and our ability to respond effectively are ours to control.