What happens when your data mining is flooded with SPAM? NSA's data center problem

The Washington Post discusses the problem of the NSA data center being flooded with SPAM.

The NSA's data-collection activities are so resource-intensive, the agency can't complete its new server farms fast enough. But when it does, a significant share of what gets held on those servers could wind up being worthless spam.

We now know the NSA collects hundreds of thousands of address books and contact lists from e-mail services and instant messaging clients per day. Thanks to this information, the NSA is capable of building a map of a target's online relationships.

The sheer volume of SPAM is probably one of the main reasons so many users try to avoid e-mail.

The writer closes by pointing out that a large part of what is stored in the NSA data center is SPAM.

Industry reports show spam accounts for an overwhelming share of all e-mail. Other internal NSA documents obtained by The Post's Barton Gellman appear to agree. If what the NSA is downloading is at all reflective of the broader Internet, then it's fair to conclude the agency collects a significant amount of spam and has little choice but to store it — meaning that of the "alottabytes" of storage the NSA brags about in its Utah data center, a heap of them will be filled with junk.