The Science of Hiding in Plain Sight

Although steganography is far from a new science, it arguably did not reach its full potential until the computer age. Early attempts at steganography involved things like invisible ink, microdots, or even secret messages tattooed on a messenger's body, either under the clothes or under the hair on his head. All of these pale in comparison to what is possible today.

Digital messages are composed of bits and bytes, which can be manipulated in subtle ways to achieve the desired result - we can embed a secret message into an innocent-looking "container" message, and no casual observer will be alerted to anything out of the ordinary. Even most non-casual observers can be fooled into overlooking your secret message.

WHY STEGANOGRAPHY?

If you have sensitive information that you wish to transmit over a potentially insecure communications channel (such as the Internet), you may wonder what advantage steganography has over, say, GnuPG for secure communications. The problem with cryptography is that it is obvious - that is, anyone who observes an encrypted message in transit can reasonably assume that the sender of that message does not want it to be read by casual observers. They may then deduce that the message contains some valuable information - information worth stealing. Steganography provides a more elegant way of communicating the sensitive information in question... by disguising it as something inane and/or otherwise uninteresting.

So, perhaps you wish to discreetly communicate sensitive information without sending up any red flags, or perhaps, like me, you're a coder who is easily mesmerized by neat data manipulation tricks and other interesting algorithms. Either way, read on.

HOW DOES IT WORK?

One of the more common approaches to steganography is to store your secret message in the least significant bit(s) of the bytes of your container message. How well this works depends on the nature of the container: it suits data whose bytes are preserved exactly, such as an uncompressed audio stream or a lossless image format such as PNG, where the low-order bits contribute little that a listener or viewer would notice.

In the case of audio, we can easily overwrite the lowest bit of data in every tenth or so byte without causing an audible loss of sound quality. In the case of image data, we can steal even a little bit more than that without the result of our manipulation being visible to the human eye. For example, consider an ordinary, pretty boring photograph. Embed a secret message into the lowest two bits of every byte of its image data, and there are no visible differences. We can even get a little bit braver and steal four bits out of every byte of image data. If you look closely at the result, you can see that something appears wrong with the sky near the top of the photo, but unless you are inspecting very closely, it would be easy to write it off as simply an image quality problem - perhaps the image was scanned in from a magazine or something. It's not until we get up to six bits per byte or even eight bits per byte that you can instantly see something is wrong.

(All example images were generated with StegPng, which we'll get to towards the end of this article.)
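
To put rough numbers on the bit depths discussed above: overwriting the n lowest bits of a byte can change its value by at most 2^n - 1 out of a possible 255. The Python sketch below shows the mask arithmetic and prints that worst-case change for each depth mentioned. The helper name `replace_low_bits` is made up for illustration and is not part of StegPng.

```python
def replace_low_bits(target: int, payload: int, n: int) -> int:
    """Overwrite the n lowest bits of an 8-bit value with the low n bits of payload."""
    if not 0 <= n <= 8:
        raise ValueError("n must be between 0 and 8")
    low_mask = (1 << n) - 1                  # e.g. n=2 -> 0b00000011
    kept = target & ~low_mask & 0xFF         # clear the n lowest bits of the target
    return kept | (payload & low_mask)       # drop the payload bits into the gap


# Worst-case change to a single byte (one colour channel, say) at each depth.
for n in (1, 2, 4, 6, 8):
    worst = (1 << n) - 1
    print(f"{n} bit(s) per byte: value can shift by up to {worst} out of 255")
```

At two bits a byte moves by at most 3 out of 255, at four bits by up to 15, and at six bits by up to 63; at eight bits the original image data is replaced entirely. That fits the progression described above, from invisible, to noticeable on close inspection, to obviously wrong.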

LET'S SEE SOME CODE ALREADY!

At the core of the code for a basic steganographic algorithm will be something vaguely resembling the following:
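
As a minimal sketch in Python, the core loop might look like this. The function name `embed_lsb`, the most-significant-bit-first ordering, and the size check are assumptions made for illustration; they are not taken from StegPng.

```python
def embed_lsb(container: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the lowest bit of successive bytes of `container`."""
    if len(secret) * 8 > len(container):
        raise ValueError("container too small: 8 container bytes needed per secret byte")

    out = bytearray(container)
    pos = 0  # index of the next target byte in the container

    for source_byte in secret:
        # Assumed bit order: most significant bit of each source byte first.
        for shift in range(7, -1, -1):
            bit = (source_byte >> shift) & 1
            # AND with 0xFE zeroes the lowest-order bit; OR then writes our bit.
            out[pos] = (out[pos] & 0xFE) | bit
            pos += 1

    return bytes(out)


# Sixteen identical bytes stand in for pixel or audio data.
container = bytes([200] * 16)
stego = embed_lsb(container, b"A")
print(list(stego[:8]))  # each value differs from 200 by at most 1
```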

This very basic algorithm takes each byte of source data (your secret message) and breaks it into its eight component bits. For each bit in the source byte, we acquire a target byte from the container data (this could be an RGB value from your wrapper image, or the next byte of audio data from a .WAV file, or whatever). We AND the target byte with 0xFE, which causes the lowest order bit to be zeroed out. We then OR in the next bit of the source byte. Once all eight bits of the source byte have been written out, we advance to the next source byte, and so on until the entire secret message has been stegged in.