Reading Saddam's Email
What to do with an enemy's hard drives.
Feb 6, 2006, Vol. 11, No. 20 • By MICHAEL TANJI
STEPHEN F. HAYES has written extensively in these pages about a large cache of documents and digital media captured in the course of Operation Iraqi Freedom and Operation Enduring Freedom. As a former intelligence officer who dealt with digital media exploitation and analysis issues at the Defense Intelligence Agency for nearly four years (2001 to 2005), I am prohibited from speaking publicly about what these documents may contain. What I can do is share my professional opinion on how one might solve some of the major problems associated with media exploitation.
Let us assume hypothetically that the United States has overthrown a hostile regime, and a vast amount of paper and digital media has been looted or otherwise removed from the regime's ministries, industrial centers, and other facilities. A great deal of this material has been obtained by the U.S. military and, eventually, by the U.S. intelligence services.
Because of the lack of context--reliable information about where each item was obtained, who it belonged to, and so on--U.S. intelligence is faced with trying to make sense of a massive, amorphous heap of paper and digital data.
The demands are tremendous. Combat commanders need actionable intelligence so they can turn around and capture or kill more of the enemy (and obtain still more media to exploit). But technical expertise and high-end equipment are hard to come by. So is good, trustworthy linguistic support. Subject matter experts are by and large still back in Washington. Given the problems, how does U.S. intelligence perform deep analysis on data that clearly need it?
The process of exploitation begins with the recognition that neither human intelligence nor signals intelligence is the be-all and end-all. Human sources can lie. They can hide parts of the truth. Unwitting dupes in a deception scheme can honestly tell you what they think is the truth. Intercepted signals generally reveal only part of the intelligence picture. In a complex web of bad guys, tapping the phones of one or two leaves a lot of gaps, especially when your adversary is a whole network of webs.
Digital media, on the other hand, are less prone to be a means of deception, and even one node of a network can reveal a significant amount about the entire network. Think about the data that you keep on your computers at work and at home. Unless you write fiction for a living, these are the most accurate and factual data that can be obtained about you (short of reading your mind). The memos and letters you write, the financial information you calculate, the websites you visit, and the people you email or instant-message--all this is a gold mine for anyone looking to know who you are, what you do, and with whom you cavort. Now imagine having access to the same data about your adversary.
Enter "computer forensics." Exploiting paper documents is a relatively simple matter of reading and, if necessary, translating. Exploiting digital media is another story. Before you can read the data, you have to find it.
Outside the intelligence field, computer forensics is the process by which data are extracted, preserved, and analyzed for pertinence and meaning. The computer forensics community has worked very hard to bring its practices up to the level portrayed on TV shows like CSI, and digital evidence is now accepted in court as readily as fingerprints or blood spatter.
It stands to reason that the same people, tools, and methods used in computer crime labs are also used in intelligence efforts. However, the courtroom-centric, linear, law-enforcement mindset is actually a hindrance to effective exploitation for purposes of intelligence. A military intelligence unit is not interested in going to court; it is interested in helping soldiers put steel on target. This is not to say that a law enforcement approach has no use in the larger intelligence business (for example, in counterintelligence investigations), but if the goal is good data fast, then what is good for cops is not good for soldiers.
ASSUME OUR HYPOTHETICAL hostile regime was a fairly large country with a population around 25 million. It was not the most technically advanced nation in the world, but it had ministries and industries and was believed to have advanced weapons capabilities. All these needed computers to function. How much data does this translate into? Consider some rough calculations.
One floor of an average-sized university library full of academic journals contains about 100 gigabytes of data, the size of a large but not uncommon hard drive. The data in 100 such hard drives are comparable to the print holdings of the Library of Congress. Care to guess whether our formerly hostile regime had more than 100 computers?
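The rough arithmetic above can be sketched in a few lines of Python. This is only an illustration of the scale involved: the 100-gigabyte drive and the 100-drives-per-Library-of-Congress ratio come from the estimates above, while the function names and the sample drive counts of 1,000 and 10,000 are hypothetical, chosen to show how quickly the total grows.

```python
# Back-of-the-envelope sizing, using the article's own rough estimates.
GB_PER_DRIVE = 100                    # one "large but not uncommon" hard drive
DRIVES_PER_LIBRARY_OF_CONGRESS = 100  # drives comparable to the print holdings


def total_gigabytes(num_drives, gb_per_drive=GB_PER_DRIVE):
    """Total data volume, in gigabytes, for a given number of drives."""
    return num_drives * gb_per_drive


def library_of_congress_equivalents(num_drives):
    """How many Library of Congress print collections the drives amount to."""
    return num_drives / DRIVES_PER_LIBRARY_OF_CONGRESS


if __name__ == "__main__":
    # The drive counts below are hypothetical, for illustration only.
    for drives in (100, 1_000, 10_000):
        gigabytes = total_gigabytes(drives)
        libraries = library_of_congress_equivalents(drives)
        print(f"{drives:>6} drives: {gigabytes:>9,} GB, "
              f"about {libraries:g} Library of Congress print collections")
```

Even on these conservative assumptions, every additional hundred captured hard drives adds roughly another Library of Congress to the pile.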