One problem business people face is sharing sensitive information properly. The information can be anything from contracts and technical specifications to resumes. The difficulty arises when you need to share or publish a document, but must withhold part of it for confidentiality or competitive reasons. In other words, you need to hide or remove text within a document. There are right ways to do this and some ever-popular wrong ways.
The term redaction may not be a household word, but it is often used in the legal community. It's the practice of removing confidential or sensitive data before giving the document to others. This is different from removing hidden metadata, because the reader can see that information has been covered up. With the advent of privacy rules and the rise of file sharing, the rest of us should be aware of how to do proper redactions. Some people also refer to this process as sanitizing.
Common Document Redaction Methods
A popular way to redact information is to use Microsoft Word's highlight tool to cover the confidential text with black. This works fine if you're printing the document or faxing it to someone else. It doesn't work if you email or upload the file, since the recipient can easily undo the highlighting. While this method is simple, it's not secure and should be avoided.
Next, people started to convert the same Microsoft Word document to Acrobat PDF. When the recipient opened the file, they would see the black highlighted text, and it would print the same way. On the surface, this looked like an elegant solution since people couldn't edit the text. But looks can be misleading.
What many authors didn't realize is that if you can select text in Adobe Reader and copy and paste it into an editor, you can see the contents underneath the black highlighting. For you skeptics, I've attached a sample PDF file so you can try this experiment on your own.
2. You can either download the file or open it in your browser.
3. Use the Select tool in Adobe Reader and select all the text.
4. From the Edit menu select Copy
5. Open an editor such as Notepad or Microsoft Word
6. Paste the text into a new document
You should be able to see the contents that were behind the black highlighted text. Don't worry; I created fictitious data just so you could see the problem.
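Why the trick works can be sketched in a few lines of Python. This is a simplified model, not how Word or PDF files actually store text: the `Run` class and its `highlight` field are hypothetical names invented for illustration, and the data is fictitious. The point is that highlighting changes only how a run of text is drawn, so anything that reads the text itself, as copy and paste does, still gets the original characters.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """A hypothetical run of text with an optional highlight color."""
    text: str
    highlight: Optional[str] = None


def render(runs: List[Run]) -> str:
    """What a viewer shows: black-highlighted runs appear as solid blocks."""
    return "".join(
        "#" * len(r.text) if r.highlight == "black" else r.text
        for r in runs
    )


def copy_text(runs: List[Run]) -> str:
    """What copy and paste returns: the stored text, ignoring formatting."""
    return "".join(r.text for r in runs)


document = [
    Run("Salary offered: "),
    Run("$85,000", highlight="black"),  # fictitious data
    Run(" per year."),
]

print(render(document))     # the amount is covered on screen
print(copy_text(document))  # the amount is still in the stored text
```

In this model the only safe fix is to change the stored text, not the way it is drawn.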
Better Solutions to Sanitize Your Documents
There are several ways to solve this issue, as the legal and intelligence communities know. One solution is to use a third-party utility from Appligent that properly handles redaction. The company offers several versions of the tool, but it may be too expensive if you only have an occasional need, since the starting price is about $200.
A more practical solution for most of us is a free add-in offered by Microsoft for Microsoft Word 2003. The company recently upgraded the tool to version 1.2, which fixed a compatibility problem with .NET. The program allows you to create a redacted document that you can send to others. The redacted text stays hidden even if you convert the document to a PDF file. This makes it easy to send documents to others without the fear and anxiety that something confidential remains.
The download installs a small floating toolbar in Microsoft Word. To redact text, you simply highlight the words and click the Mark button.
Your redacted text displays in 25% gray shading. This may be an issue if you have other parts of your document using the same shading and color percentage.
As you might guess from the screen captures, the toolbar is very easy to use. There are a few caveats. You can use the tool on most parts of a Word document, but there are some exceptions. Specifically, the Redaction Add-in does not support redaction of:
- Content in textboxes or frames
Once you've highlighted all the text to redact, you click Redact Document. This creates another Word document. You also have the option of protecting the document.
Now if you were to copy the text, or convert the document to an Adobe PDF file and perform the trick mentioned above, you wouldn't see your redacted text. If you look at the contents in a programmer's editor in hex mode, you'll see pipe signs in place of the hidden text. And no, there isn't a one-to-one correlation between letters and pipe signs.
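The idea of replacing the text rather than covering it can be sketched as follows. This is an illustration only, not the add-in's actual code: the `redact` function, the `(start, end)` span format, and the fixed placeholder width are all assumptions made for the example. Using the same number of pipe signs for every redaction is one way to avoid a one-to-one correlation between hidden letters and placeholder characters.

```python
from typing import List, Tuple

PLACEHOLDER = "|" * 8  # assumed fixed width, independent of the hidden text


def redact(text: str, spans: List[Tuple[int, int]]) -> str:
    """Return a copy of text with each (start, end) span replaced by pipes.

    Spans are half-open character offsets and must not overlap.
    """
    result = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor or end > len(text) or start > end:
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        result.append(text[cursor:start])  # keep the text before the span
        result.append(PLACEHOLDER)         # drop the span, insert pipes
        cursor = end
    result.append(text[cursor:])
    return "".join(result)


original = "Contact Jane Doe at 555-0100."  # fictitious data
marked = [(8, 16), (20, 28)]                # "Jane Doe" and "555-0100"

print(redact(original, marked))
```

Because the marked characters are discarded instead of hidden, nothing in the output can be copied, pasted, or converted back into the original text.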
One of the items that Microsoft references in its safeguards is to use caution when redacting single words. The concern is based on reports that people can use dictionary-type tools along with knowledge of your font to predict the missing word. An interesting article on this subject by John Markoff, titled Illuminating Blacked-Out Words, appeared in the New York Times in May of 2004.
I found the Microsoft Word add-in very easy to use. The primary limitation is that it only works with Microsoft Word 2003. For people using older versions of Word, I'd suggest reading the procedures for sanitizing documents outlined by the NSA, listed in the Related Resource section. The 14-page document outlines how to use Microsoft Word and Adobe Acrobat to safely publish documents.
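The kind of guess behind that caution can be sketched roughly as follows. The character widths, the candidate word list, and the function names are all made up for illustration; a real attempt would use the actual metrics of the document's font. The sketch only shows the principle: in a proportional font, the width of the blacked-out area rules out most candidate words.

```python
from typing import Dict, List

# Hypothetical glyph widths in arbitrary units; real fonts define their own.
WIDTHS: Dict[str, int] = {"i": 3, "l": 3, "t": 4, "r": 4, "m": 9, "w": 9}
DEFAULT_WIDTH = 6  # assumed width for any letter not listed above


def rendered_width(word: str) -> int:
    """Total width of a word under the assumed glyph widths."""
    return sum(WIDTHS.get(ch, DEFAULT_WIDTH) for ch in word.lower())


def candidates(gap_width: int, words: List[str], tolerance: int = 1) -> List[str]:
    """Words whose rendered width is within tolerance of the redacted gap."""
    return [w for w in words if abs(rendered_width(w) - gap_width) <= tolerance]


word_list = ["merger", "lawsuit", "layoff", "bonus", "audit"]
gap = rendered_width("merger")  # stands in for measuring the black bar

print(candidates(gap, word_list))
```

The shorter the redaction and the smaller the set of plausible words, the more the width alone gives away, which is why single-word redactions deserve extra care.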
- Version Reviewed: Office 2003 Add-in: Word Redaction v1.2
- Requires: Microsoft Word 2003; .NET Framework 1.1
- Cost: Free
- Snipped URL: http://snipurl.com/gtka *
* We needed to use a shorter URL as the Microsoft one was so long it created formatting problems. The link above will take you to the Microsoft Word Redaction Tool page.
Last Updated (Sunday, 30 September 2012 15:13)