FEATURE
Surviving Hacker Attacks Proves that
Every Cloud Has a Silver Lining
By Kirby Cheng
When I first realized I'd been hacked, I thought, "Oh, my God, they really
did it!" While trying to recover from the shock, I had to call the supervisors
of the departments with services affected by the loss of the library's server.
In a seemingly calm voice, I told them one by one, "Sorry to let you know,
our server is down; it has been hacked into." Today, most libraries' resources
are Web-based. You don't have to be a server administrator to understand the
heat I was feeling when my server was hacked into on a summer day in 2002.
I work at the Franklin D. Schurz Library at Indiana University South Bend.
The library serves about 7,000 students and 300 faculty members. In late 2001,
Schurz Library bought its first server, a Dell PowerEdge 2550. As the head
of library information technology, I was responsible for setting up and running the server.
After "scrubbing" the machine to detect hardware defects, I installed the Windows
2000 Server operating system. Working with the library's Webmaster, I configured
Internet Information Services (IIS). It hosted an online course Web site,
an interlibrary loan (ILL) FTP document delivery site, and a virtual reference
service. It also hosted some of the library's important files, such as those
for the periodical holdings list. In fact, the library's first server was also
the first production server I administered independently. Although certified
by Microsoft to manage the server, I had not dealt with a complex system emergency.
Unexpected Cloudburst
The first hacker attack occurred when I was a rookie administrator. Late
at night, the campus IT department's network monitoring system detected an
unusually high volume of traffic originating from the IP address of our server.
The phenomenon resembled something that had taken place on a departmental lab
server not long before: A server had been hijacked and was being used as a
hacker's launching pad. The security surveillance system swiftly cut off our
server's network connection. The IT security officer told me that he would
resume our server's network connection only after we thoroughly investigated
the incident.
The server was a total mess: Several important library functions lay paralyzed.
Nevertheless, I realized that there was no such thing as an "escape clause" in
a server administrator's job description, so I began to shift my focus to the
cleanup. I was eager to see if the server would still boot up. Luckily, it
did. This gave me some confidence, because a working operating system would
make damage assessment and security investigation much easier. When I took
a look at the damage, I found that some critical Web files had become corrupted
and that the IIS could not function properly. I then investigated how the hacker
got into our server. To my surprise, the security logs did not catch any illegal
login attempts. I didn't find any new, unknown user accounts, and the system
privileges to alter the existing security policies hadn't been invoked. I also
found no suspicious activities after reviewing the server's baseline.
However, after examining the login records once again, I noticed that our
Web technician was online when the suspiciously large amount of data was being
downloaded. So I contacted him, and it turned out that he was a night owl who
preferred working in the evening hours when no one bothered him. On the previous
night, while he was tuning the IIS configurations after downloading some files
to his home workstation, his connection with the remote server was abruptly
severed. It seemed to me there had been no attempt to hack at that point. The
files were unintentionally damaged by the IT department staff members' "friendly fire." When
they tried to quarantine our server, they abruptly cut off our Web technician's
remote network connection. Because some critical files were open at the time,
they became corrupted. Similar things happen occasionally when a server is
improperly shut down. This might explain why I had not caught an intruder and
why only certain Web functions had problems.
Cleaning Up the Flood
I was fairly confident that this was not an invasion, so I decided to use
the backup tapes to restore the corrupted files. Due to the risks involved
with replacing system files, I had never tried to restore the whole C: drive
or the system state data since the server had been in production. Now I had
the opportunity to practice a system recovery and to test various procedures
to resuscitate the server. It may sound contradictory to associate a "bad" situation
with a "good" learning opportunity. Yet, quite often when we survive a major
catastrophe, we're taught something that we could not have learned on a regular
day.
I suggested to the members of the IT department that they consider my friendly-fire
theory as the cause of the incident. Not quite convinced, they conducted their
own investigation. Despite the fact that they had found no trace of a break-in
and that my system recovery efforts had restored the major IIS functions, they
decided that we needed to rebuild the server. I understood their decision.
They suspected that the incident was a security breach when it occurred, and,
just to be safe, they would rather treat it as a break-in. Following the relevant
security procedures, we rebuilt the server. We also took this opportunity to
improve the Web server's configurations.
Preparing for Stormy Days
This incident set off an alarm for me. I realized that server emergencies
were not mere scenarios in training manuals. They could happen any time to
our system, and I needed to be prepared for them. The first thing I had to
do was ensure that I would always have good backup tapes available. So in my
Server Administration Procedures, I stipulated the special occasions when a
normal (full) backup should be performed, in addition to the routine ones which
were scheduled on all workdays. For example, in order to have an image of a
clean server, I knew I would need to make a full backup after I loaded the
operating system. I would do the same thing again after I configured the IIS
or applied a major service pack; then I would archive the tapes. When I had
the original error-free critical files on the backup tapes, I would be able
to quickly rebuild the server and resume any impaired library services. To
ensure a tape would work when I needed it, I would go further and select a noncritical
file and do a trial file restoration. Along with making the tape backups, I
would update the Emergency Repair Disk after I had patched the server's operating
system so that I would have the current essential system files I needed to
recover from a boot failure.
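
A trial restoration like the one described above can be checked mechanically by comparing checksums of the original file and the restored copy. The following is a minimal Python sketch of that idea, not the procedure used at the library; the function names sha256_of and restore_matches are hypothetical, and it assumes both files are reachable on disk at the same time.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 64 KB at a time so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True when the restored copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

If restore_matches returns False for a noncritical test file, the tape (or the restore procedure) deserves a closer look before it is trusted in an emergency.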
I also decided to update my documentation. (I consider updated, accurate
documentation essential for a fast system recovery.) For instance, I would
patch the server and then record the date and the critical system patches I
applied. With this information, I could immediately decide what patches I needed
to reinstall after I had restored the system state using an earlier backup
tape. Similarly, if a red critical error appeared in the system log, I would
document its content and the time of its first occurrence so that I could choose
an error-free tape to replace the corrupted system files.
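
The patch record described above amounts to a dated log that can be queried after a restore. As a rough Python sketch (the PatchRecord class, the patches_to_reinstall function, and the patch identifiers in the usage note are all hypothetical), the lookup might be expressed like this, assuming a patch applied on the same day as the backup is already on the tape:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatchRecord:
    applied_on: date
    patch_id: str  # hypothetical identifier, e.g. a vendor bulletin number


def patches_to_reinstall(log, backup_date):
    """Return the patches applied after the backup was taken, oldest first.

    Assumption: a patch applied on the backup date itself is already
    captured by the backup, so only strictly later patches are returned.
    """
    later = [record for record in log if record.applied_on > backup_date]
    return sorted(later, key=lambda record: record.applied_on)
```

For example, with a log holding PATCH-A (March 1), PATCH-B (April 10), and PATCH-C (May 2), restoring from a tape dated March 15 would return PATCH-B and PATCH-C in that order.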
Although these procedures proved to be a good shortcut for repairing certain
system failures, using the backups to replace the problematic system files
was not a cure-all for server malfunctions. One reason was that Windows 2000
Server's built-in backup was not a full-function file backup/restoration utility.
It often failed to replace certain open system files. This meant that I would
have to be prepared for the worst-case scenario. In case the backup did not
work, I would have to remove and reinstall the related Windows components,
such as the IIS, or even rebuild the operating system. This would often lead
to reconfiguration. To be prepared for rebuilding a server from scratch, I
would document the server's baseline, user accounts, file system structure,
and file permissions for critical file folders. For a third-party-run application,
I would note the vendor's support telephone number. Often, vendors had to reinstall
their application after we rebuilt our server.
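
Part of this documentation, the file system structure, lends itself to automation. The sketch below is one possible approach in Python, with hypothetical function names (snapshot_tree, save_snapshot); it records each path with its type, size, and POSIX-style mode string. Note the limitation: these mode bits do not capture Windows NTFS access control lists, which would have to be recorded with a platform-specific tool.

```python
import json
import os
import stat
from pathlib import Path


def snapshot_tree(root):
    """Map each path under root (relative, '/'-separated) to basic metadata."""
    root = Path(root)
    records = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            full = Path(dirpath) / name
            info = full.lstat()  # lstat: describe links themselves, not targets
            records[full.relative_to(root).as_posix()] = {
                "kind": "dir" if stat.S_ISDIR(info.st_mode) else "file",
                "size": info.st_size,
                "mode": stat.filemode(info.st_mode),
            }
    return records


def save_snapshot(root, out_file):
    """Write the snapshot as sorted, indented JSON for later comparison."""
    text = json.dumps(snapshot_tree(root), indent=2, sort_keys=True)
    Path(out_file).write_text(text)
```

A snapshot saved while the server is known to be clean gives a reference copy to consult when rebuilding the folder layout from scratch.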
Since ours is a production server, I could not afford to let it be down too
long. To finish the reconfiguration quickly, I needed to have accurate, detailed,
system-setting records in hand. To solve this problem, I decided to use screen
shots to record the vital server components' complex configurations. Now, I
also periodically take screen shots of the security patches applied on the
server. I do this for two reasons. On the one hand, the screen shots provide
up-to-date security information for a system recovery; on the other hand, they
might well exonerate the library's staff members should we be accused of negligence.
These records prove that we make every attempt to secure the server.
Finally, fearing another lightning bolt, I began communicating more frequently
with the campus IT department staff members. I was especially careful about
establishing remote-access communications. Needless to say, I started notifying
them if a librarian wanted to use the remote desktop connection on his workstation
to work on the server. I also informed them if our library added an application
that was administered by a remote vendor to the server.
Another Sudden Storm
Ever since the initial incident, I had been diligently patching the server
and monitoring its security logs. Nearly 2 years had passed, and no major security
breaches had occurred. However, nothing lasts forever: My string of sunny days
ended in May 2004. While I was doing a routine review of the security logs,
I found that the system had caught several HackTool spyware viruses. Lurking
in a computer, the viruses could find and decrypt login data, such as usernames
and passwords. After checking Symantec AntiVirus, I noticed that all of the
spyware viruses had been quarantined. I continued to probe various parts of
the system for evidence of possible infiltration. I found no suspicious phenomena.
There were no illegal user accounts or questionable login attempts. I also
verified the login events of the privileged users. The server's baseline was
untouched. I found no unauthorized Web functions, such as an illegal FTP site,
in the IIS. All system services running under "Services and Applications" were
also accounted for. There was no rogue process running, either. Previously, I had
reported to the help desk workers when a similar spy virus was found on staff
workstations. They told me that as long as the virus was under quarantine,
I need not worry about it. Thinking it might be just another nuisance, I felt
somewhat relieved.
For the rest of the week, I watched the server closely. Several days passed
without viruses. However, on the following Monday, I was dismayed to see the
red errors appearing again in the Event Viewer. Over the weekend, the HackTool
viruses had come back to haunt me! Realizing that the viruses might have viciously
infiltrated our server, I reported the incident to the IT staffers. They dispatched
two more-seasoned network administrators to the library. Following the university's
security-breaches investigation procedures, we first disconnected the server's
network cable. Now, unfortunately, I had to repeat the scene described at the
beginning of this article: telling the whole library that the server was
down. To minimize the interruption of the library's services, my IT colleagues
and I quickly transferred the major Web functions to the Web servers of other
departments and resumed the affected services. We scanned the related Web files
before we reloaded them so that a virus would not spread to the new hosts.
Tracking Down the Leaks
Our second step was to investigate how the intruders managed to compromise
the existing security system and what they had done to the server. We started
our probe by looking at the files recorded by the Symantec AntiVirus. By viewing
the items logged under "Quarantine" and "Virus History," we were able to locate
the infected files. After examining the quarantined files and the files linked
with them, we surmised that the hacker had bypassed the IIS and set up an illegal
FTP site at an unconventional place in the server. Using different file names
associated with the known FTP applications as the keywords, we searched the
server. We tried to pin down the application used by the infiltrators. Eventually,
we narrowed down the names on our culprit list to EZ-FTP. We traced this to
the hackers' hidden FTP site, a folder in a "cave" surrounded by legitimate
files.
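
A keyword search of this kind is straightforward to script. The following Python sketch illustrates the general technique only; it is not the tool used in the investigation, the function name find_by_keywords is hypothetical, and any keyword list passed to it would have to be assembled by the investigator.

```python
import os


def find_by_keywords(root, keywords):
    """Return sorted paths under root whose file or folder name contains
    any of the keywords, ignoring case."""
    needles = [keyword.lower() for keyword in keywords]
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            lowered = name.lower()
            if any(needle in lowered for needle in needles):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Matching on names alone will miss a renamed executable, so a search like this narrows the field rather than proving that a server is clean.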
The folder was the hackers' treasure chest. Searching this chest, we found
a large number of compressed MPEG and MOV files, which did not surprise us.
What we did not expect to find were the hackers' internal working documents,
including a detailed network-scanning report of our server. Hackers often scan
networks to select a suitable target before they attack. They use the technique
to find out the server's system capability and its security environment. A
hacker's ideal prey should have two characteristics. First, its system capability
should be large enough to make hacking worthwhile. Second, its security should
be weak enough for the hackers to find loopholes. Unfortunately, for some reason,
our server met the hackers' criteria, and we fell victim to them.
In addition to the scanning report, there was a catalog of the feature films
that had been converted from DVD and stored at the FTP site. The catalog contained
many Hollywood blockbusters. Along with the catalog were the conduct codes
governing fair use of the site that held their stolen properties. One code
required the site users to keep their connection time as short as possible
to avoid being detected. Another one warned: "Hacked, should not be hacked
again." The intruders feared that another hacker would accidentally destroy
their carefully built nest. Contrary to their stereotyped image in our minds,
these hackers did care about security, but only when it pertained to their
illegal FTP site. As bizarre as the documents were, the real eye-opener was
a property title to our server. A group of German hackers issued the title
to themselves. Clicking the file, we saw the logo of the group: a colorful
graphic of mounted medieval knights. The first sentence read: "Team hacked,
team use [sic]." The sentence reflected the concepts of collectivism and common
property. I could not help admiring these Germans. They were true fellow countrymen
of Karl Marx, for they had integrated the Communist ideology into their documents.
One Last Lightning Bolt
Before concluding our investigation, we copied all of the hackers' files
and documented the whole investigation process. Later we would use these documents
to file a security-breach report with the higher-level IT security office.
Nevertheless, our saga continued.
A few days after we had taken back their unlawfully seized property, these
digital-age knights rode back to our campus and infiltrated numerous workstations.
Honoring their knighthood tradition, they spared our critical files, but they
made sure we knew that they were invincible and could penetrate our defense
whenever they pleased. Picking up the gauntlet, we quickly drove out the invaders
and rebuilt our network security defenses.
Our third step was to restore the server's functions. According to the university's
security regulations, a server's operating system must be scrapped and rebuilt
once its security has been compromised. While the server was down, we searched
the manufacturer's Web site and updated the server's firmware. We verified
and reformatted the hard disks and updated the existing documentation. Naturally,
we also upgraded the server's operating system to Microsoft Windows Server
2003. With my accurate, complete documentation, we easily reconfigured the
IIS. After patching the server and scanning its ports for potential security
risks, we reloaded the Web files that had been filtered by the antivirus software.
Finally, we did a trial run and scan of ports for vulnerabilities before we
put the server back in production.
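
The simplest form of a port scan just checks which TCP ports accept a connection. The sketch below shows that basic check in Python using the standard socket module; the function name open_tcp_ports is hypothetical, and this is far less thorough than a dedicated vulnerability scanner, since it reports only whether a port is open, not whether the service behind it is vulnerable. It should be run only against hosts you administer.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the ports from `ports` that accept a TCP connection on host."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise,
            # instead of raising an exception for a refused connection.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

Any port in the result that is not on the list of services the server is supposed to offer is worth investigating before the server goes back into production.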
Seeing the Silver Lining
As illustrated by my experience, when a system emergency occurs, we may feel
as if heavy clouds are overhead. In my case, I composed myself quickly, survived
the storm, and turned these traumatic incidents into opportunities for developing
new technical knowledge. In addition, I gained precious hands-on experience
in security-breach investigation and system recovery.
Break-ins and other system emergencies are part of our lives as systems librarians.
Every day, we face new challenges. But one thing is constant: No matter what
happens, the sun also rises. We will find that every cloud has a silver lining.
THINGS TO BACK UP
The server with the original error-free operating
system (full backup, in archive)
The drivers of the third-party devices (in archive)
The original, error-free server after adding major
services and critical files (full backup, in archive)
The server right before applying security patches (full
backup)
The server right after applying security patches (full
backup; archive the tapes related to major service packs)
30-day routine backups (full backup)
Emergency Repair Disk (update the disk after each major
system change)
Important non-system files (back up the updated files
to a network drive)
THINGS TO DOCUMENT
Major applications run on the server
Configurations of the important system services and
major applications
Server's baseline
Security patches applied
User accounts
File system structure
File permissions for critical file folders
Logs of the major system events
Logs of the initial occurrences of critical errors
Contact information for the third-party vendors
Contact information for the server's manufacturer
WHAT I LEARNED FROM THE BREAK-INS
1. Have no illusions. As soon as a server is connected to a network, it
risks being attacked.
2. The Web server must be protected by an effective firewall.
3. Apply all critical security patches appropriate for the server.
4. Know the server's baseline and be constantly alert to rogue processes
and illegal services.
5. Keep the administrators' passwords as secure as possible.
6. Try to understand hackers' mind-sets and be familiar with their most-used
techniques.
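
Item 4 above, knowing the baseline, comes down to comparing what is running now against a documented list. A minimal Python sketch of that comparison follows; the function name compare_to_baseline is hypothetical, and how the list of observed services is collected is left out because it depends on the platform.

```python
def compare_to_baseline(baseline, observed):
    """Return (unexpected, missing) service names relative to the baseline.

    Names are compared case-insensitively and returned in lowercase.
    """
    baseline_set = {name.lower() for name in baseline}
    observed_set = {name.lower() for name in observed}
    unexpected = sorted(observed_set - baseline_set)  # running, not documented
    missing = sorted(baseline_set - observed_set)     # documented, not running
    return unexpected, missing
```

For instance, with a baseline of W3SVC and MSFTPSVC and an observed list of w3svc and an invented name such as ezftpd, the function returns ezftpd as unexpected and msftpsvc as missing.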
HOW I WOULD DEAL WITH A SECURITY-BREACH INCIDENT
Investigation Phase
1. Disconnect the server from the network.
2. Report the incident to the IT security officer.
3. Keep the status of the server unchanged.
4. Make a full backup of the server to preserve information.
5. Conduct a damage assessment.
6. Find out how the hacker compromised the security.
7. Document the whole investigation.
Rebuilding Phase
1. Verify and update the documentation.
2. Reformat the hard disks.
3. Update the firmware.
4. Reload or upgrade the operating system.
5. Re-patch the server.
6. Configure major system services.
7. Test the server.
8. Filter all the non-system files for viruses, then reload the files.
9. Reconnect the server to the network.
10. Do a scan for network port vulnerabilities.
11. Test the fully loaded server.
12. Update the documentation and archive the new critical backups.
13. Put the server back in production.
Kirby Cheng is head of library information technology at Franklin D. Schurz
Library at Indiana University South Bend. He holds an M.L.I.S. from the University
of Texas at Austin. He is also a Microsoft Certified Systems Administrator (MCSA).
His e-mail is xicheng@iusb.edu.