If you run Linux servers, the code that decides whether a login succeeds is code you have almost certainly never read, never hashed, and never watched for drift. It ships in a package, it gets installed once, and then it disappears into the category of things nobody thinks about until something breaks. That blind spot is not incidental. Three unrelated attacker groups, using three unrelated techniques, have all independently concluded that pam_unix.so and sshd are the best place in a Linux host to hide — and the evidence says they were right for years at a time before anyone noticed.

Sygnia’s June report on Operation Highland documents a China-nexus group it tracks as Velvet Ant (MITRE G1047) that backdoored the authentication stack of an air-gapped critical infrastructure network and stayed inside for close to a decade, with forensic artifacts dating back to 2016. It is not an isolated case. ESET’s Windigo/Ebury operation compromised roughly 400,000 Linux servers over fourteen years using the same basic idea (trojanize the thing that authenticates users) and still had more than 100,000 active infections as of late 2023. And in 2024 the XZ Utils backdoor (CVE-2024-3094) showed that you don’t even need a network intrusion to get there: a supply-chain implant in liblzma, linked transitively into sshd on several distributions, would have given whoever held the attacker’s private key remote code execution in the SSH login path of a meaningful fraction of the internet. It was caught only because a Microsoft engineer noticed sshd logins were about 500ms slower than they should be.

Three campaigns, three delivery mechanisms, one target. That convergence is the story, and it means the auth stack deserves the same treatment your fleet gives internet-facing services — continuous integrity verification, not periodic goodwill.

What Velvet Ant actually did

Sygnia’s incident response team was called into a network with no direct internet connectivity. Velvet Ant had staged through internet-facing servers first, using a modified build of GS-Netcat for a reverse shell back to C2, then pivoted deeper using a custom binary layered on top of modified Nginx configs — an HTTP POST request to the right endpoint triggers an outbound SSH connection to an internal host — plus a custom implementation of the open-source ssspl SOCKS5 proxy for tunneling. None of that is novel tradecraft. What happens next is.

Once Velvet Ant had a foothold on enough hosts, it took over the authentication layer in three independent, redundant ways:

Layer 1 — PAM. The group replaced pam_unix.so, hooking pam_sm_authenticate, the function every login path calls to decide whether a supplied password is valid. The trojanized module accepts a hardcoded backdoor password on top of legitimate ones, and silently writes every username/password pair typed by real users to a hidden file for later harvesting. Sygnia recovered nine distinct variants, each compiled in a separate build environment — that’s a pipeline, not a one-off patch job, and it implies the group maintains this capability across engagements rather than hand-rolling it per target.

Layer 2 — OpenSSH. Modified ssh, sshd, scp, and in some builds ssh-keygen captured credentials on both inbound and outbound connections, logged every command typed in a session, and wrote the output to encrypted files disguised to blend in with normal filesystem noise. A hidden command-line flag suppressed credential logging on demand — useful when the operator wants a “clean” session for a legitimate admin nearby — and another disguised the process name in ps output. At least one variant could disable SELinux when run as root, in case anything else in the chain tripped enforcement.

Layer 3 — authorized_keys. The boring, durable fallback: attacker-controlled public keys appended to authorized_keys on compromised hosts. This survives password rotation entirely, and it’s the layer most likely to get missed because most credential-rotation runbooks don’t include an authorized_keys audit as a mandatory step.

The result is access that produces no anomalous log entries, because the component that would generate an anomalous log entry is the backdoor. There’s no failed-login spike, no new service, no unexpected listening port. The intrusion looks exactly like every other legitimate login on the box, because as far as PAM and sshd are concerned, it is one.

This is not a one-off — it’s a proven, scalable technique

It’s tempting to read Operation Highland as a bespoke nation-state capability aimed at one hardened target. The Ebury/Windigo case says otherwise. Ebury is fundamentally the same idea — an OpenSSH backdoor and credential stealer — but deployed criminally, at internet scale, since 2009. Its more recent variants inject via LD_PRELOAD against a trojanized libkeyutils.so rather than patching sshd directly, giving it userland-rootkit behavior inside live SSH sessions, plus a DGA for C2 resilience. ESET’s tooling for finding it boils down to two cheap checks: is libkeyutils loaded in a process where it has no business being loaded, and are there large (3MB+), world-writable (666) shared memory segments sitting around from ipcs -m. Four hundred thousand servers got popped by essentially the same category of attack Velvet Ant used, over a much longer window, run by financially motivated operators rather than a well-resourced APT.

XZ Utils is the third data point, and it matters because it removes the “you need to already be inside the network” precondition entirely. liblzma ships in a large fraction of Linux distributions and is linked into libsystemd, which sshd in turn links on distros that build OpenSSH with the systemd notify patch. The backdoored releases (5.6.0 and 5.6.1) intercepted RSA public key verification in sshd and allowed a specially crafted certificate to grant remote code execution before authentication completed. The implant was planted over roughly two years of patient, socially engineered maintainer trust-building on an open source project. The target was, again, the SSH authentication path. The delivery mechanism was different (supply chain instead of post-compromise lateral movement), but the objective was identical: control the code that decides who gets in.

Read together, these three cases aren’t a coincidence of independent researchers finding similar bugs. They’re three different attacker populations — a nation-state APT, a long-running cybercrime botnet, and (per the ongoing investigation into XZ) a suspected state-linked supply-chain operation — converging on the same conclusion: PAM and OpenSSH are high-trust, low-telemetry, rarely-audited, and catastrophic to get wrong once compromised. That’s a rational target selection, and it will keep happening until defenders change what they monitor.

Why the auth stack is the perfect hiding place

Three properties make PAM and OpenSSH structurally different from almost everything else in your threat model:

  • They’re trusted by definition. Every log line your SIEM ingests about who logged in when comes from the very binary an attacker has compromised. You cannot detect a lie using the liar’s own testimony. EDR agents that hook higher-level syscalls or watch process trees will typically see a completely normal sshd accepting a completely normal connection.
  • Nobody re-verifies them after install. Package managers verify a checksum at install time and then never again unless you explicitly run rpm -Va or debsums -c. Most fleets don’t run either on a schedule, and even fewer run file-integrity monitoring against /lib/security/, /usr/sbin/sshd, and /usr/bin/ssh specifically.
  • Remediation is genuinely dangerous. Replacing PAM modules or sshd on a live, in-use host risks an authentication lockout if the replacement binary or module is even slightly wrong — wrong dependency version, wrong distro build, missing shared library. Sygnia notes the target fleet spanned multiple Linux distributions and versions, meaning every remediation package had to be built, tested, and staged with a rollback plan per host. Admins are — correctly — afraid to touch this code path, and attackers know it.

That third point is worth sitting with. An attacker who backdoors your load balancer config gets evicted the next time someone reviews the config. An attacker who backdoors pam_unix.so gets evicted only when someone is willing to risk locking every user out of a production host to go check.

Detection playbook

None of this requires exotic tooling. It requires running checks that already exist, on a schedule, against the specific files that matter.

Verify package integrity against the distro’s checksums, not your own memory of what “normal” looks like.

# RPM-based (RHEL, Alma, Rocky, Amazon Linux)
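# -V verifies the named packages (-Va would verify every installed package);
# "sshd" is a binary, not a package: it ships in openssh-server
rpm -V --nomtime --nosize pam openssh openssh-clients openssh-server \
  | grep -E 'pam_unix|sshd|/usr/bin/ssh|/usr/bin/scp|ssh-keygen'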

# Debian-based
debsums -c libpam-modules openssh-server openssh-client 2>&1 \
  | grep -E 'pam_unix|sshd|/usr/bin/ssh|/usr/bin/scp'

A checksum mismatch on any of these files, on a host where you didn’t just apply a security update, is a page-someone-now event. rpm’s verify output also flags attribute changes such as mode, ownership, and timestamps, which can be too noisy for some environments. For a low-noise recurring job, filter to lines with a 5 in the third position of the flag string, which marks a file digest mismatch.
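As a minimal sketch of that low-noise filter, the function below reads rpm -V output on stdin and keeps only non-config files that are missing or whose digest changed. The name digest_mismatches is illustrative and not part of rpm, and the sketch assumes paths without spaces.

```shell
# Hypothetical helper: reduce `rpm -V` output to the paths worth paging on.
# Each rpm -V line is a 9-character flag string, an optional attribute
# marker ("c" for config files), then the path. Position 3 is "5" when the
# file digest differs from the package database.
digest_mismatches() {
  awk '
    $1 == "missing"            { print $NF; next }  # file removed entirely
    $1 ~ /^..5/ && $2 != "c"   { print $NF }        # digest changed, not a config file
  '
}

# Usage (on an RPM-based host):
#   rpm -V pam openssh openssh-clients openssh-server | digest_mismatches
```

Config files are skipped because they are expected to drift from the packaged version; if your PAM or sshd configuration is centrally managed, you may prefer to keep them in and compare against config management instead.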

Check for Ebury’s LD_PRELOAD pattern even if you think you’re only worried about Velvet Ant.

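# Any process with libkeyutils.so mapped that isn't sshd/ssh itself deserves a closer look
for maps in /proc/[0-9]*/maps; do
  pid=${maps#/proc/}; pid=${pid%/maps}
  # -q: report via ps only, instead of also printing the maps path
  grep -q libkeyutils "$maps" 2>/dev/null && ps -p "$pid" -o pid,comm,cmd
done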

# Large, world-writable shared memory segments
# ipcs -m columns: key shmid owner perms bytes nattch status
ipcs -m | awk '$4 ~ /666/ && $5+0 > 3000000 {print}'

Audit every authorized_keys file on every host, not just the ones you remember creating — including service accounts.

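# Walk every home directory listed in passwd so service accounts are included
getent passwd | cut -d: -f6 | sort -u | while read -r home; do
  f="$home/.ssh/authorized_keys"
  [ -f "$f" ] && awk -v h="$home" '{print h": "$0}' "$f"
done | grep -vFf known_good_keys.txt
# -F matches allowlist lines as fixed strings (key material contains regex
# metacharacters); the allowlist must not contain blank lines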

Run this against a maintained allowlist (known_good_keys.txt) generated from your config management source of truth, not a snapshot of “keys that were there last time someone looked.” Compromised systems are exactly where “last time someone looked” is doing the damage.
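Whole-line matching has a weakness: a known key with an edited comment or an added option looks new, which adds noise to the report. A sketch of one way around that, under the assumption that comparing only the base64 key blob is acceptable for your audit, is below. The function names key_blobs and unknown_keys are hypothetical.

```shell
# Print only the base64 key blob from authorized_keys-style lines on stdin.
# SSH public key blobs always begin with "AAAA", which helps skip over
# option strings that happen to contain a key-type word.
key_blobs() {
  awk '
    $1 ~ /^#/ { next }                              # comment line
    {
      for (i = 1; i < NF; i++)
        if ($i ~ /^(ssh-|ecdsa-|sk-)/ && $(i + 1) ~ /^AAAA/) { print $(i + 1); next }
    }
  ' | LC_ALL=C sort -u
}

# List blobs present in the audited file but absent from the allowlist.
unknown_keys() {   # usage: unknown_keys <authorized_keys> <allowlist>
  audited=$(mktemp); allowed=$(mktemp)
  key_blobs < "$1" > "$audited"
  key_blobs < "$2" > "$allowed"
  LC_ALL=C comm -23 "$audited" "$allowed"
  rm -f "$audited" "$allowed"
}
```

Any blob this prints is a key that config management does not know about. Options such as from= or command= are ignored here, so a separate review of those is still worthwhile.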

Baseline file-integrity monitoring on the specific paths that matter, before you need it. AIDE is unglamorous but it works, and the entire point is that the baseline has to predate the compromise:

# /etc/aide.conf additions
# Custom group: permissions, owner, group, size, mtime, ctime, SHA-512 digest
AUTHSTACK = p+u+g+s+m+c+sha512
/lib/security AUTHSTACK
/lib64/security AUTHSTACK
# Debian/Ubuntu multiarch location of the PAM modules on amd64
/lib/x86_64-linux-gnu/security AUTHSTACK
/usr/sbin/sshd AUTHSTACK
/usr/bin/ssh AUTHSTACK
/usr/bin/scp AUTHSTACK
/usr/bin/ssh-keygen AUTHSTACK
/root/.ssh/authorized_keys AUTHSTACK

If you’re standing this up for the first time on a host you believe is already compromised, you’ve just baselined the backdoor as “known good.” That’s not a hypothetical caveat — it’s the exact reason FIM has to be deployed at provisioning time, from a golden image you trust, not retrofitted onto a fleet that’s already years into production.
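The same baseline-first idea can be sketched without AIDE, assuming GNU coreutils sha512sum is available and the baseline file is stored somewhere the host cannot write to. The function names make_baseline and check_baseline are hypothetical, and on a host you already distrust the sha512sum binary itself should come from trusted media.

```shell
# Record SHA-512 digests for the given files. Run this on the trusted
# golden image, not on the live host.
make_baseline() {   # usage: make_baseline <baseline-file> <path>...
  out=$1; shift
  sha512sum "$@" > "$out"
}

# Re-hash on the live host. Prints each path whose digest differs or that
# can no longer be read, and returns non-zero if there is at least one.
check_baseline() {  # usage: check_baseline <baseline-file>
  failed=$(sha512sum -c "$1" 2>/dev/null | sed -n 's/: FAILED.*$//p')
  [ -z "$failed" ] && return 0
  printf '%s\n' "$failed"
  return 1
}

# Example:
#   make_baseline /mnt/trusted/auth.sha512 /usr/sbin/sshd /usr/bin/ssh
#   check_baseline /mnt/trusted/auth.sha512 || echo "auth stack drift detected"
```

This only covers content hashes, not ownership or mode, so it complements rather than replaces a full FIM tool.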

Watch for behavioral tells that don’t depend on trusting sshd’s own output: SELinux/AppArmor being disabled on a host where config management says it should be enforcing; ps process names that don’t match their /proc/<pid>/exe target; shell history gaps that correlate with known admin login windows but show no corresponding auth log entry from a different, independently-logging source (auditd’s execve records, a bastion host’s own session recording, or cloud provider serial console access logs — anything that isn’t sshd attesting to its own behavior).
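The process-name check can be sketched by comparing the kernel’s short name in /proc/<pid>/comm with the basename of the /proc/<pid>/exe link. This is a triage list, not a verdict: scripts run by an interpreter and programs that legitimately rename themselves will also appear. Both function names are hypothetical.

```shell
# True when the kernel's short process name matches the executable's
# basename. comm is truncated to 15 characters, so compare on that prefix;
# a replaced-while-running binary shows up with a " (deleted)" suffix.
name_matches_exe() {  # usage: name_matches_exe <comm> <exe-path>
  base=${2##*/}
  base=${base%" (deleted)"}
  [ "$1" = "$(printf '%.15s' "$base")" ]
}

# Print pid, comm, and exe target for every process where they disagree.
scan_proc_names() {
  for dir in /proc/[0-9]*; do
    exe=$(readlink "$dir/exe" 2>/dev/null) || continue   # kernel thread or no permission
    comm=$(cat "$dir/comm" 2>/dev/null) || continue      # process exited mid-scan
    name_matches_exe "$comm" "$exe" || printf '%s\t%s\t%s\n' "${dir#/proc/}" "$comm" "$exe"
  done
}
```

Run scan_proc_names as root so every exe link is readable, and diff the output between runs so that only new mismatches need a human look.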

Remediation without bricking your fleet

Given the lockout risk, don’t patch in place on hosts where you can’t tolerate an outage. Stand up a clean replacement from a known-good golden image or fresh package install, migrate the workload, and decommission the old host rather than trying to selectively replace pam_unix.so and sshd under it. For hosts where that’s genuinely not feasible — physically isolated OT/ICS boxes, appliances with no build pipeline — build and test the replacement binaries in a lab that mirrors the exact distro and kernel version first, keep a known-good rescue path (console access, a secondary auth method that doesn’t route through the compromised stack) available before you touch anything, and stage the rollout host-by-host rather than fleet-wide.
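One cheap pre-flight check for the “missing shared library” failure mode, sketched under the assumption that the replacement binary was built by you and has been copied to the target host: ask ldd which dependencies fail to resolve before anything is swapped in. The function name missing_libs and the staging path are placeholders.

```shell
# Hypothetical pre-flight helper: list shared libraries a candidate binary
# needs but this host cannot resolve. Empty output means none are missing.
# Only run ldd on binaries you built yourself; it is not safe on untrusted files.
missing_libs() {   # usage: missing_libs <path-to-binary>
  ldd "$1" 2>/dev/null | awk '/=> not found/ { print $1 }'
}

# Example gate before staging a replacement (the path is a placeholder):
#   new=/opt/staging/sshd
#   [ -z "$(missing_libs "$new")" ] && "$new" -t && echo "pre-flight passed"
```

The sshd -t call in the example is OpenSSH’s test mode, which checks the configuration file and host keys without starting the daemon. Passing both checks does not remove the need for a console or other rescue path.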

Takeaways

  • The authentication stack is now a named target category, not an edge case. Velvet Ant, Ebury, and XZ Utils each picked PAM/OpenSSH independently, across nation-state, criminal, and supply-chain attack models. Budget for defending it accordingly.
  • Signature and IOC-based detection cannot see this. The compromised component is the one generating your logs. Detection has to move to integrity verification (package checksums, FIM) and cross-source correlation (auditd, bastion recording, out-of-band console logs) that doesn’t depend on the auth stack telling the truth about itself.
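  • Run rpm -Va / debsums against PAM and OpenSSH packages on a schedule, today. It’s a five-minute cron job that would have flagged the replaced pam_unix.so and OpenSSH binaries in the Velvet Ant case, and Ebury’s trojanized libkeyutils.so if you add the keyutils library package to the list. It would not have caught XZ Utils, where the backdoor shipped inside the distribution’s own package.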
  • Baseline FIM before you need it, from a golden image. Retrofitting AIDE onto a live, possibly-compromised fleet just certifies the backdoor as normal.
  • Audit authorized_keys fleet-wide against a config-management source of truth, including service accounts. It’s the persistence layer that survives password rotation and gets forgotten in almost every incident response runbook.
  • Treat “we’re afraid to touch sshd in production” as the finding, not just the obstacle. If your remediation plan for a compromised auth stack is “replace the host,” your prevention plan should be built around never letting the live host’s binaries go unverified long enough for that to become necessary.