Chapter 9 BabelNet
“A child of five could hack this network. Fetch me a child of five.”
The hour was 3:00 a.m. Elena sat staring at her laptop. It being the only light source in the room for the last three hours, her attempts at sleep were cut short by the lingering anti-flicker under her closed eyelids… (She laughed at the thought—was this a bug, or an “undocumented feature” in her occipital lobe?) Her eyes danced a frenetic, analog tango; saccades skittering, as thought after thought evaded coalescence on the question, let alone its answer. Amidst a dozen windows, each filled with the textual detritus of command-line repartee, there was one that caught her attention, draped in nothing but a single character.
Root—complete access to whatever system one was so privileged to join. The kind of hash that script kiddies smoked. If only absolute trust was so easy to detect in the real world, or for that matter, that easy to acquire.
“Do you accept this woman to be your lawfully wedded wife?”
“You may share your root password.”
Elena twirled her hair slowly, staring vaguely into the distance. How had she gotten here? Oh yeah, Fabinet. Once a music major, Elena achieved her first taste of notoriety when she managed to co-opt the speakers of all 60 desktops in her college computer lab, causing them to simultaneously erupt in a 120-part, massively surround-sound symphony. “Ride of the Valkyries”—of course, Apocalypse Now style, with helicopters swirling across every node—had never sounded better, especially in the middle of a midterm.
She might have gotten in some serious trouble, had it not been for the deft suggestion that “Real-time Mixing of Massively Surround Sound within a Hostile Network” might bring tenure to her (associate) professor. Even he was impressed that the system could seamlessly adapt to any particular host dropping out of the ad-hoc orchestra, its fallen instruments or silenced conductor’s wand immediately resurrected on a nearby host. (He was less impressed by Elena’s use of Elmer’s Glue to lock the volume knob in place. By the time she had picked that lab clean, it looked like somebody had molted his skin into the garbage can.)
Mirror, Mirror on the Wall
But history would not explain what was going on now. Maybe it had something to do with the kiddies? The shell was on a honeypot machine, set up specifically to allow monitoring of “attackers in the wild.” (Elena would not compliment them by calling them hackers, nor insult herself by calling them crackers.) Hmmm… what was bouncing around the honeynet, anyway? She could run a sniffer and see addresses bounce to and fro.
Most people used tcpdump. She usually preferred the vastly more elegant Ethereal, in its tethereal text mode, no less. (She had learned many a protocol on the back of tethereal -v, which dumped multipage breakdowns of every last whisper on her network.) But on this occasion, a much more direct order was required, made possible by a tool called Linkcat (lc).
Computer, take all the raw data on the network. Filter out everything readable by humans, at least eight English characters long. Give me the results.
On and on it went, electronic whispers plucked en masse from the aether. Protocols aren’t really anything more than ways for the disconnected to connect to each other. They exist among people as much as they do electronically. (It’s an open question which type of protocol—human or computer—is harder to support.) Most electronic protocols don’t stick to letters and numbers that humans can read, making it pretty simple, given all the bytes off the wire, to read only that information written in the language of people themselves. Elena vegged to the half dozen protocols, stripped of their particular identity into only what she might have the sense to read.
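The filter Elena asked for is easy to approximate. What follows is a minimal Python sketch of the idea, not Linkcat's actual implementation: it takes raw bytes (however they were captured) and keeps only runs of at least eight printable ASCII characters. The function name readable_runs and the sample frame are made up for illustration.

```python
import re
from typing import List


def readable_runs(raw: bytes, min_len: int = 8) -> List[str]:
    """Return every run of at least min_len printable ASCII characters."""
    # Printable ASCII spans 0x20 (space) through 0x7e (tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(raw)]


if __name__ == "__main__":
    # A made-up frame: binary noise wrapped around two readable strings.
    frame = b"\x00\x1c\xffGET /index.html HTTP/1.0\r\n\x01\x02ok\x00SSH-2.0-OpenSSH\x00"
    for text in readable_runs(frame):
        print(text)
```

Anything shorter than the threshold, like the stray "ok" in the sample, is discarded as noise.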
A Cisco switch announced to the world that it, indeed, existed, thanks to the heroic compilation of R. Heaton. A Web page was pulled down. Some other device issued universal Plug and Play commands, seeking a neighbor to play with (and potentially get plugged by, as the most serious Windows XP exploit showed). SSH2—secure shell, version 2—was rather chatty about its planned crypto exchange, not that such chattiness posed any particular threat.
And then there was SMB.
When Good Packets Go Bad
SMB, short for Server Message Block, was ultimately the protocol behind NBT (NetBIOS over TCP/IP), the prehistoric IBM LAN Manager, heir-apparent CIFS, and the most popular data-transfer system in the world short of e-mail and the Web: Windows file sharing. SMB was an oxymoron—powerful, flexible, fast, supported almost universally, and fucking hideous in every way shape and byte. Elena laughed as chunkage like ECFDEECACACACACACACACACACACACACA spewed across the display.
Once upon a time, a particularly twisted IBM engineer decided that this First Level Encoding might be a rational way to write the name BSD. Humanly readable? Not unless you were the good Luke Kenneth Casson Leighton, co-author of the Samba UNIX implementation, whose ability to fully grok raw SMB from hex dumps was famed across the land, a postmodern incarnation of sword-swallowing.
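For the curious, the encoding is mechanical enough to sketch in a few lines of Python. Each byte of the name is split into two 4-bit halves, and each half is added to the letter 'A'. The sketch below assumes the name is simply padded with spaces to 16 bytes; real NetBIOS names reserve the last byte for a type suffix, which this ignores. The function names are illustrative only.

```python
def first_level_encode(name: str) -> str:
    """Encode a NetBIOS-style name as 32 letters in the range A through P."""
    padded = name.upper().ljust(16)[:16].encode("ascii")
    letters = []
    for byte in padded:
        letters.append(chr(ord("A") + (byte >> 4)))    # high nibble
        letters.append(chr(ord("A") + (byte & 0x0F)))  # low nibble
    return "".join(letters)


def first_level_decode(encoded: str) -> str:
    """Reverse first_level_encode and strip the space padding."""
    chars = []
    for i in range(0, len(encoded), 2):
        high = ord(encoded[i]) - ord("A")
        low = ord(encoded[i + 1]) - ord("A")
        chars.append(chr((high << 4) | low))
    return "".join(chars).rstrip()


if __name__ == "__main__":
    encoded = first_level_encode("BSD")
    print(encoded)
    print(first_level_decode(encoded))
```

The letter B is 0x42, so it becomes E (A plus 4) followed by C (A plus 2); every padding space, 0x20, becomes CA.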
This wasn’t the only way to sniff. Chris Lightfoot’s Driftnet (http://www.exparrot.com/~chris/driftnet) had achieved some popularity. Inspired by the Mac-only EtherPEG (http://www.etherpeg.org), it spewed not text, but actual images and mp3s screaming through the network. This was great fun at wireless Internet-enabled conferences. The weblogger types had christened it the greatest method invented for tapping the collective attention span of audience members. (As a cross between columnists, exhibitionists, and vigilante quality assurance, the webloggers were always keenly interested in Who Was Hot and Who Was Not.)
But as particularly applies to reading minds, be careful what you wish for, or you just might get it. Elena wouldn’t launch Driftnet at gunpoint. Although she refused to talk about the circumstances of her phobia, it probably had something to do with that unfortunate multimedia misadventure involving Britney Spears and a goat. One was the visual, and the other was the mp3, but damned if Elena would tell anyone which was which.
Authorspeak: Paketto Borne
It was in November 2002 that I released the first version of the Paketto Keiretsu (http://www.doxpara.com/paketto). It was “a collection of tools that use new and unusual strategies for manipulating TCP/IP networks.” At least one authority had called them “Wild Ass,” but I was left with no small amount of egg on my face after a wildly bombastic original posting on that geek Mecca, Slashdot.org. A much more rational index had been posted on Freshmeat. It read as follows:
Paketto was an experiment. No, it was more than that. It was a collection of proofs of concept—an attempt to actually implement some of the amusing possibilities I’d talked about at that perennial agglomeration of hackers, hangers-on, and Feds: DEF CON 10, with “Black Ops of TCP/IP.” It was an entertaining experience and quite educational. Apparently, a 12-pack of Coronas beats a Windows laptop on auto-suspend, when the judges are a 500-strong crowd of hackers, hax0rz, and all the Feds in between.
And They Say We’re Social Creatures
Elena sighed. She saw nothing, just the generic chatter of networks. And then something different fluttered by:
Ah, the old school Internet Relay Chat—IRC! It was much more readable under the Linkcat hack than Yahoo and AIM; there was no need for Dug Song’s msgsnarf to demunge the traffic. Elena laughed. Apparently, one of the (many) intruders on this network had actually set up an IRC server for himself and all of his friends to hang out in. Oh well, that was the purpose of this honeynet: Find out what people are up to and get a heads-up on just how dangerous the net really might be. Rumors that Elena’s honeynet had anything to do with the constant stream of first-run movies and Simpsons episodes that magically appeared on its 250GB Maxtor without Elena lifting a finger were completely unfounded.
Elena peered back at the screen.
WTF? Elena threw on a chat filter and sat back to watch 31ph_ and dw0rf (Tolkien would be proud) fight over a remote connection to a command prompt.
Round One: Fight!!!
What the hell was this, Dungeons and Admins? Still, she was mildly impressed. These guys blew away the average graduate of the AOL Academy for Perfecter English. Somebody had to bust through the idiot filters on the honeynet. She was just about to accidentally reward them with additional bandwidth to the warez ser…honeynet when her pager went off.
A port scan? There?
Port-scanning is a curious construct. A brute-force method of discovering available network services, simply by asking for them and noting the response, it’s compared to an entire range of behaviors, legitimate and maybe less so: looking through a window, rattling a door handle, knocking on doors, or taking a survey. Elena didn’t pay too much attention to the legal rigmarole. Whatever port-scanning was, it sure as hell wasn’t particularly stealthy. At the end of the day, port-scanning involved dumping traffic on a wire, screwing up (after all, if you already knew what was open, there wouldn’t be much of a point in sending out a probe), and, oh yeah, leaving a return address for responses to come back to.
Quirky packet tricks with names like XMAS and Stealth-SYN had long since failed to hide anything. They were left-hand-blind-to-the-right-hand-style stunts that relied on the core kernel of the system doing something while not informing user software that anything was done—a sort of “silent-but-deadly” failure mode. Disabused of the notion that the kernel could be trusted to recognize the harbingers of its own demise, user software now sniffed the network directly to determine what was going on.
That’s not to say people didn’t still try to sneak scans under the radar. One popular approach was to hide their identity, masking their requests among dozens of false decoys, creating plausible deniability at the expense of vastly reduced network bandwidth.
It turned out this didn’t work very well. The nmap tool—the Rolls Royce of port-scanners, written by the “Gnuberhacker” Fyodor—would often be pressed into decoy mode, like so:
That would scan you.are.so.0wned.com, while setting up apparent decoy scans from Microsoft, AOL, and Yahoo. This led to amusing multiple-choice questions like:
Of course, resolving all those names wasn’t always advisable. A couple of attackers got smart enough to operate from IP addresses whose DNS name resolution process they controlled. So, once defenders started checking through logs, seeing who was breaking into what, the attacker might get tipped off. (Checking whois records against ARIN, the IP allocation agency, was much safer, though potentially less accurate.) But DNS cuts both ways, and while name resolution isn’t critical to detecting an attack, it is often employed to mount attacks.
The Internet doesn’t route by name. Names are immediately converted to IP addresses, and somebody needs to do that conversion. While a couple of attackers are able to run a DNS infrastructure, almost all defenders ultimately have control over their name servers. So of the four IPs that appeared to be scanning, the one that actually resolved you.are.so from 0wned.com was the attacker. Duh.
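That reasoning reduces to a set intersection. The Python sketch below assumes the defender already has two things in hand: the list of addresses that appeared to scan, and the name server's query log as (client address, queried name) pairs. Both data shapes, the function name, and the sample addresses are hypothetical, and a scanner that resolves through someone else's recursive server would slip past this check.

```python
def likely_real_scanners(scan_sources, dns_queries, target_name):
    """Return the apparent scan sources that also looked up the target's name."""
    wanted = target_name.rstrip(".").lower()
    resolvers = {
        client
        for client, queried in dns_queries
        if queried.rstrip(".").lower() == wanted
    }
    # Keep the original ordering of the scan log.
    return [ip for ip in scan_sources if ip in resolvers]


if __name__ == "__main__":
    sources = ["192.0.2.10", "198.51.100.7", "203.0.113.5", "192.0.2.99"]
    queries = [
        ("203.0.113.5", "you.are.so.0wned.com."),
        ("192.0.2.10", "example.org"),
    ]
    print(likely_real_scanners(sources, queries, "you.are.so.0wned.com"))
```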
Of course, decoy-scanning could include decoy DNS requests, or possibly even have the scanner able to manually bounce its requests off arbitrary DNS servers. But it was, at best, a losing arms race.
At this point, Elena had many questions and precious few answers. The heavily firewalled backup network—sadly, without the time-controlled incoming access mandated by the physical security playbook—had just sent out a distress signal of Elena’s creation. Apparently, something was looking around. Now, it could have been anything from a random engineer playing with a new scanning tool to a Trojaned machine, to yet another department looking to usurp network awareness responsibilities from their rightful place behind her eyeballs. She analyzed the network alert:
Once Elena had learned about the “accidental” DNS traffic that a simple scan might spawn, it was only a matter of time before she looked for other layers that might leak useful information. DNS transformed addresses from the long, human-readable names users saw in their applications (layer 7) to the short, machine-routable addresses (layer 3) that wound their way around the net. It was necessary because the net, as a whole, didn’t grok names. But Ethernet didn’t grok IP addresses either. Ethernet needed to use these slightly longer but globally unique addresses known as MACs.
Whenever a packet was destined not for some faraway host, but instead, to a neighbor on the local network, ARP—the Address Resolution Protocol—would translate the machine-routable addresses (layer 3) to globally unique addresses (layer 2). ARP would do so by broadcasting a request, and in doing so, it could be used to expose the behavior of an impatient interloper. Mass scans had unexpected side effects (another blade that cut both ways, actually), one of which was causing a router to ARP for a large number of hosts simultaneously, all on broadcast. Therein lies the advantage: The host on which Elena had installed an ARP monitor lived on a switched network. She couldn’t convince the nimrods at IT to install an inline IDS on what was obviously an important resource. Without the inline IDS, and with the network switching traffic so she might see only frames destined for her network card, how could she detect her neighbors being scanned? She couldn’t, but she could watch the router react to carrying the scans, because it was broadcasting to anyone who would listen that it needed a huge number of addresses resolved ASAP.
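A monitor along those lines could be sketched as follows. This is a guess at the shape of such a tool, not Elena's actual code: it assumes ARP requests have already been parsed into (timestamp, sender IP, target IP) values by some capture layer, and it raises a flag when one sender asks for too many distinct targets inside a short window. The class name, window, and threshold are all illustrative.

```python
from collections import defaultdict, deque


class ArpBurstDetector:
    """Flag a sender that ARPs for many distinct addresses in a short window."""

    def __init__(self, window_seconds=5.0, threshold=50):
        self.window = window_seconds
        self.threshold = threshold
        # sender IP -> deque of (timestamp, target IP), oldest first
        self.recent = defaultdict(deque)

    def observe(self, timestamp, sender_ip, target_ip):
        """Record one ARP request; return True if the sender looks like a scan relay."""
        queue = self.recent[sender_ip]
        queue.append((timestamp, target_ip))
        # Drop requests that have aged out of the window.
        while queue and timestamp - queue[0][0] > self.window:
            queue.popleft()
        distinct_targets = {target for _, target in queue}
        return len(distinct_targets) >= self.threshold


if __name__ == "__main__":
    detector = ArpBurstDetector()
    for i in range(60):
        if detector.observe(i * 0.01, "10.10.0.1", "10.10.1.%d" % i):
            print("burst detected after", i + 1, "requests")
            break
```

Because the requests are broadcast, this works even on a switched segment where the scan traffic itself never reaches the monitoring host.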
That was the trigger—the oddity that demanded her interest. The next couple hours were consumed by the drudgery of examining the logs, filtering out the known, identifying the unknown, and tracing the attacker. This was the part of security work that paid the bills, the spiritual inverse of dumpster diving. But eventually, the problem was traced to a single IP: 10.10.250.89. That was the good news. The bad news was that Elena had to find this host, fast, because it had apparently been used to install backdoors on machines throughout the company. Plus, all backdoored hosts needed to be located and cleansed. It was amusing that the kid was using port 31337. Luckily, he wasn’t the only one who could wield a scanner.
Scanrand was an experiment—a very simple, very successful experiment, with a cryptographic edge rare in this kind of network code, but an experiment nonetheless. Port-scanning was historically implemented using operating system resources. The operating system kernel would be asked to initiate a connection to a given port, and after some amount of time, either the connection would work or it wouldn’t work. Then you would move on to the next host/port combination. This was very, very slow. Some scanners would simultaneously ask the operating system to connect to multiple ports, allowing it to try a couple different targets at once. This was merely very slow. The nmap tool was much better, but for all its mastery, it wasn’t perfect. It still suffered massive delays as it tried to validate that any packet it sent would, at the end of the day, elicit a response if possible.
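The "very, very slow" approach looks roughly like the Python sketch below: ask the kernel for a full connection, wait for the verdict, move on. It is a generic illustration of a sequential connect scan, not the code of any particular scanner, and the function name is made up. Point it only at hosts you are authorized to test.

```python
import socket


def connect_scan(host, ports, timeout=1.0):
    """Try a full TCP connection to each port in turn; map port -> open?"""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            results[port] = sock.connect_ex((host, port)) == 0
    return results


if __name__ == "__main__":
    print(connect_scan("127.0.0.1", [22, 80, 443]))
```

Every filtered port costs a full timeout, and nothing else happens while the scanner waits, which is exactly the one-call-at-a-time habit the chapter goes on to blame.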
The problem, at the end of the day, was phones. Not the devices, which still rule, but the ideas surrounding how they worked, what they were limited by, and what they could do. Phones were deep. You would call relatively few people, and you would ideally talk at length, racking up charges. It wasn’t impossible to make the Internet simulate this, and more than a few voice-over-IP companies had made quite a bit of cash doing so. But IP itself was quite unreliable; it did only what it could, and in return could be as simple, fast, and powerful as you wanted it to be. Phones were depth-oriented. Good for them, but port-scanning was breadth-oriented—talk to everybody and say almost nothing.
IP couldn’t care less what you were trying to do with your packets. That’s why it worked so well. The entire concept of IP could be summed up as, “Send it to someone who cares.” But the interfaces were all so phone-oriented. Scanrand wasn’t.
The basic idea of Scanrand was pretty simple. It split the act of scanning into two parts: one would spew the necessary packets onto the network, and the other would examine what came back. Unlike previous implementations of this idea (fping, notably), Scanrand looked not just for hosts that were up or down, but also for actual services on those hosts. Scanrand scanned TCP services statelessly; that is, without keeping track of which hosts had and hadn’t replied. Given that TCP was an entirely stateful protocol, this was somewhat of a feat. And it worked well.
A Local Scan In A Tenth Of A Second
The technique scaled, too. A single port-scan on a class B network with 65,000 hosts took only a matter of seconds to return almost 10,000 positive replies. It wasn’t stealthy. It used no invalid packets, and it required no special access. But it was power the attackers could use only at their peril and defenders could exploit at their leisure.
This was real-time auditing. It wasn’t bad for an experiment, but there was a problem.
The efficiency of stateless scanning was based on a simple presumption: Less work requires less time. (Not the most complicated presumption.) If you don’t take the time to keep track of who you sent packets to, you can send packets faster—with no memory load, either.
But what if somebody detected your stateless scan? What then? Since you weren’t tracking outgoing requests, you’d accept any received packet as if it was a response to your own scan. An attacker could confuse, misdirect, and generally manipulate your scanning engine to believe hosts were up when they really weren’t. That couldn’t be allowed.
The solution was a modern twist on an ancient technique: Inverse SYN Cookies. In 1996, attackers discovered that if they simply sent out a large number of SYN (Synchronization, or “Connection Initiated”) messages to a system, the kernel, anticipating a large number of incoming connections from the outside world, would consume all sorts of valuable kernel memory preparing for all these exciting new opportunities.
Then it would die. (This was bad.)
The most elegant solution to this problem came from Professor D.J. Bernstein, of the University of Illinois at Chicago. DJB examined the structure of TCP itself. TCP, the protocol used to move web pages and email around, starts out with what’s referred to as a “three way handshake” before actually allowing data to be sent. In a nutshell, the client would send a SYN (wanna talk?), the server would reply with a SYN/ACK (sure, what’s up) or RST/ACK (go away), and the client would reply again with an ACK (nothing much). There was a measure of security to TCP, based on verification of what’s known as the Ability to Respond. Both the SYN and the SYN/ACK would contain randomly generated values known as ISNs (Initial Sequence Numbers), that would need to be specifically acknowledged in the SYN/ACK and ACK, respectively. So, to send a correct ACK, you had to receive a SYN/ACK. To receive the SYN/ACK, you had to have entered a legitimate value for your own IP address in your SYN.
So, DJB reasoned, if a small cryptographic token (and some minor additional data) was used as the ISN instead of some random bytes, the kernel could receive a SYN, send a SYN/ACK, and promptly forget about the remote host until a valid ACK—with the server-generated stamp of approval—came back. Only then would all the memory be allocated for this new and exciting connection.
Inverse SYN Cookies took this one step further. The ACK didn’t just reflect the SYN/ACK; the SYN/ACK also reflected the SYN. So a cryptographic token in the SYN would have to return in any valid SYN/ACK or RST/ACK. Linking the cryptographic token—a SHA-1 hash truncated to 32 bits, to be technical—to the IP and Port combinations that an expected SYN/ACK or RST/ACK had to have meant that an individual host could only reply for itself, not for someone else, not even for a port on itself that was not specifically scanned. It could either respond correctly, or not at all. (It could actually respond repeatedly, but since IP networks do not guarantee that a particular packet will only arrive once, this didn’t even require the target to participate in the duplication.)
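The bookkeeping can be sketched in Python. The text only says the token is a SHA-1 hash truncated to 32 bits and tied to the IP and port combination; the keyed HMAC construction, the field ordering, and the function names below are assumptions made for illustration rather than scanrand's exact recipe. The sender stamps each SYN with the cookie as its sequence number, and the receiver, knowing nothing but the shared seed, recomputes it from the reply's own addresses.

```python
import hashlib
import hmac
import socket
import struct


def scan_cookie(seed, src_ip, src_port, dst_ip, dst_port):
    """32-bit token binding a probe to its exact address and port pair."""
    message = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    digest = hmac.new(seed, message, hashlib.sha1).digest()
    return struct.unpack("!I", digest[:4])[0]  # keep the first 32 bits


def reply_is_valid(seed, reply_src_ip, reply_src_port,
                   reply_dst_ip, reply_dst_port, reply_ack):
    """Check that a SYN/ACK or RST/ACK acknowledges a probe we really sent."""
    # A reply's addresses are the probe's, flipped, and TCP acknowledges
    # the probe's sequence number plus one.
    expected = scan_cookie(seed, reply_dst_ip, reply_dst_port,
                           reply_src_ip, reply_src_port)
    return reply_ack == (expected + 1) & 0xFFFFFFFF


if __name__ == "__main__":
    seed = b"this_is_a_test"
    isn = scan_cookie(seed, "10.0.1.38", 40000, "10.0.1.5", 139)
    ack = (isn + 1) & 0xFFFFFFFF
    print(reply_is_valid(seed, "10.0.1.5", 139, "10.0.1.38", 40000, ack))
```

A forged reply from the wrong host, or from a port that was never probed, hashes to a different value and is thrown away, so the listener needs no table of outstanding requests.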
This particular feature allowed some rather…useful behaviors.
For example, with all state contained in the packets themselves, IPC (inter-process communication) between the sender and the receiver, even if they were operating on different hosts, came quite free. On one host, you could type this, specifying “Send Only, seed=‘this_is_a_test’, spoof the IP 10.0.1.38, send to all 139 (SMB) ports between 10.0.1.1 and 10.0.1.254”:
Assuming you had run the following command on 10.0.1.38, specifying “Listen Only, Accept Errors (down ports), never time out, and seed=‘this_is_a_test’”:
Suddenly, this might pop up.
You could even scan outside your network:
And from that very same process on 10.0.1.38, you’d see the following reply.
If you were looking, you might notice that on the local scan, every reply carried the same number, but on the remote scan, port 80 (HTTP) returned one number, while port 443 (HTTP encrypted via SSL) returned a larger one. What were those numbers, anyway?
They’re an estimation of how far away the remote server is, in terms of hops along the network. It’s actually possible to guess, having received any packet, just how far that packet had to travel to arrive at your host. This is because of a construct known as the TTL, or Time To Live. Each time a packet traversed yet another router on its quest to get closer to its destination, whatever value was in the TTL field of the packet—a number between 0 and 255—would be decremented by one. If the TTL ever reached 0, the packet would be dropped. This was to prevent lost packets, traveling in circles around the entire network, from permanently consuming resources. Eventually, they’d run out of steam and die.
By humans, for humans, like humans: Our own genetic structure contains telomeres, small chunks of DNA that get shaved off a bit each time our cells split. Too many shaves, and the cell can no longer spawn new cells. It’s how we age, and why we die.
All packets on IP networks require an initial TTL. Almost without exception, it begins at 32, 64, 128, or 255. This means something interesting: If a packet was received, and its remaining TTL was 58, its initial TTL was probably decremented 6 times: 64–58=6. If a packet was received, and its TTL was 250, its initial TTL was probably decremented 5 times: 255–250=5. Since every decrement was done by a router, one could gauge the number of routers passed by the offset from one of the default values.
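The arithmetic is simple enough to write down. This Python sketch assumes the sender started from the nearest of the four common initial values at or above what was observed, which is the same guess the text makes; the function name is illustrative.

```python
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def estimate_hops(observed_ttl):
    """Guess how many routers decremented a packet's TTL before it arrived."""
    if not 0 < observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Assume the smallest common starting value the packet could have had.
    initial = min(ttl for ttl in COMMON_INITIAL_TTLS if ttl >= observed_ttl)
    return initial - observed_ttl


if __name__ == "__main__":
    for ttl in (58, 250):
        print(ttl, "->", estimate_hops(ttl), "hops")
```

The guess can be wrong: a remaining TTL of 30 might be two hops from a host that starts at 32 or thirty-four hops from one that starts at 64. The estimate simply prefers the shorter story.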
Sooner or later, P2P (Peer to Peer) networks would start using this to organize their virtual networks.
So why did Google’s SSL port appear 3 hops farther away? Say hello to their SSL accelerator, and possibly a separate network used to serve its content.
This wasn’t the only quirky thing one could find with TTLs:
Was the host 11 hops away, 12 hops away, or 22 hops away? Turned out a slight bug in the kernel on local.doxpara.com was adding an extra hop to a legitimate RST/ACK, but what was up with the 22-decremented packets? The firewall. Trying to be as efficient as possible, it was simply taking the incoming SYN, flipping the IPs and ports, setting the flag to RST/ACK, fixing the checksums, and sending the packet on its merry way.
What it wasn’t doing was resetting the TTL. So having already decremented 11 times coming in, it decremented another 11 times going out. Thus the legitimately down port (21) could be differentiated from the filtered ports (139, 8000, and 31337).
TTL monitoring would even occasionally find particularly nasty network hacks:
Apparently, the mail server on local.doxpara.com had teleported 15 hops closer than the rest of the network. Oh, and Microsoft had given up on Exchange.
TTLs didn’t always begin at one of the cardinal values. Traceroute—one of the oldest tools for debugging IP networks—worked by sending a packet with a TTL of 1, then 2, then 3, and so on, watching which hosts sent ICMP Time Exceeded messages back to the host in response. Of course, scanrand supported traceroute just like it supported port scans:
One could even simultaneously scan across both hosts and routes, creating a sort of “spider map” that will eventually be visualizable:
Occasionally, a trace would show a little more than expected:
Network Address Translation: Hated by many, but still astonishingly powerful and useful, NAT would translate an unroutable internal address (192.168.*, 172.16.* through 172.31.*, or 10.*) into a globally routable external address. Among other things, this meant a host had no idea who the rest of the world saw it as. Scanrand could sometimes find out: Since the ICMP error elicited by the trace contained parts of the IP packet that spawned it when its TTL expired (the entire IP header, and 8 bytes of TCP, to be precise), scanrand could examine the ICMP portion to learn about what hit the global Internet. This was necessary anyway to do stateless tracerouting, but sometimes more interesting things were found, as the verbose version of the above trace shows:
But the most interesting traces from scanrand actually come from its cousin tool, Paratrace. Since TCP is a Layer 4 protocol placed on top of Layer 3 IP, all IP functionality can still be tapped even when TCP is in use. That means traceroute can work over TCP—and beyond that, traceroute can work over existing TCP connections. For example, if Elena found an attacker coming in over an SSH connection, she could launch paratrace and it would tunnel back to the intruder over the TCP session they established. Though not common, this occasionally would even get through a firewall the attacker had set up, since the packets were indeed part of an established session:
Back to Our Regularly Scheduled Hackery
Given what Elena knew about Scanrand, it was easy to quickly issue a command to scan port 31337 (“elite”) across the entire corporate infrastructure, though she did need to take a moment to log in to the machine the IDS was prepared to see scans from. (There was an alternative design by which the unused TCP Window Size was configured to contain a short signature of a legitimate scanner; this was to facilitate IDS cooperation with the scanrand tool. But this hadn’t been completed yet.) The results were annoying, but what could you do: 150 hosts had been obviously compromised, out of approximately 40,000 desktops. The penetration level wasn’t nearly high enough for a remote root compromise (almost all the machines were on the same image; a hole in one would have exposed a hole in all), and the machines lived across too many lines of business for an infected file server to have been the vector. She suspected a memetic virus. A cross between a standard virus (which spread without the knowledge of the user) and a Trojan Horse (which was accepted with the happy knowledge of the user, but didn’t spread), a memetic virus was a Trojan Horse good enough that people sent it to their friends.
The hour was late, and there were still unanswered questions: Why did that one host execute the port scan? They probably knew about the backup network simply by observing what IP received all the backups from the desktop, but was this an insider, or somebody poking through the firewall? She had placed the Honeynet off a public DSL line; perhaps somebody had tracked its owner back to her company. But those were questions that would have to wait for another day…