Kuro5hin.org: technology and culture, from the trenches

Steganography With Lists

By pwayner in Technology
Tue May 07, 2002 at 12:27:20 PM EST
Tags: Security (all tags)

Is David Letterman broadcasting a secret message each night with his Top 10 list? What is VH-1 doing when it broadcasts the top 100 videos of all time? Perhaps they're sending secret messages. Any list can be rearranged to carry a hidden message. I wanted to ask the Kuro5hin participants for some help finding the best lists.

In order to plug the second edition of Disappearing Cryptography, my book on steganography, I built a Java applet that lets you hide information in a list of items. The web page explaining the technique and displaying the applet is here. The applet is at the bottom of the page.

The current version of the applet can hide a message of 175 bits by rearranging a list of 43 of the best disco songs from the 1970s. Any list of objects will do, and longer lists can store more information.
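The 175-bit figure follows from counting orderings: 43 items can be arranged in 43! ways, and the number of whole bits that fit in that choice is floor(log2(43!)). A minimal Python sketch of the calculation (the function name capacity_bits is made up for illustration and is not taken from the applet):

```python
import math


def capacity_bits(n: int) -> int:
    """Whole bits that one ordering of n distinct items can carry: floor(log2(n!))."""
    # For a positive integer x, floor(log2(x)) equals x.bit_length() - 1.
    return math.factorial(n).bit_length() - 1


print(capacity_bits(43))   # the 43-item disco list
print(capacity_bits(100))  # a "Top 100" list
```

Using exact integer arithmetic avoids any floating-point rounding near a power of two.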

The best topics are ones with no strong bias that might give away the existence of a hidden message. Is one disco song really better than another? YMMV

For instance, I thought about using a list of the 43 U.S. presidents because historians are always coming up with their own rankings of the best to the worst. This ranking could carry a hidden message, but the biases of researchers could give it away. It's somewhat unlikely that someone who ranked Ronald Reagan highly would put Calvin Coolidge or George Bush near the bottom of the list. (But even this is possible. I know one person who thinks GHWB betrayed the Reagan legacy.)

Here's my list of criteria for good lists:

  • The best lists are funny, common, and relatively random. There should be nothing suspicious about circulating them. The characters in High Fidelity are always sharing top five lists. (It's fascinating to look at the subtle differences between the lists in the movie and the book.)
  • The lists should come with no obvious order. Music and art are great topics for debate because even the most passionate will offer wildly different thoughts. Someone may love the Beach Boys' "Pet Sounds" but hate "Endless Summer".
  • The items should be short. This only matters if you want to be economical with bandwidth. The shortest entries are just the binary representation of the numbers between 0 and n-1. These grow very efficient as n gets large. It's possible to capture close to 100% of the bandwidth with large values of n.
  • The lists should be long. You can pack log2(n!) bits into each list of n items, because there are n! possible orderings. Longer lists mean longer hidden messages.
I hope to take the suggestions from the Kuro5hin audience and pre-load them into the applet to make life a bit easier for everyone to have some fun without typing in a list. Thanks for your interest.
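For readers who want to see how an ordering can carry a number, here is a rough Python sketch. It is not the applet's code: it assumes the items are distinct strings, that both sides agree on alphabetical order as the starting point, and it leaves out the password step entirely. The function names encode and decode are invented for this example. The message is treated as an integer below n! and written in the factorial number system, where each digit picks one of the items still unused.

```python
import math


def encode(items, message: int):
    """Return the items reordered so that the ordering encodes `message`."""
    pool = sorted(items)  # shared canonical order (an assumption of this sketch)
    n = len(pool)
    if not 0 <= message < math.factorial(n):
        raise ValueError("message does not fit in a list of this length")
    ordered = []
    for remaining in range(n, 0, -1):
        # Each factorial-base digit selects one of the `remaining` unused items.
        digit, message = divmod(message, math.factorial(remaining - 1))
        ordered.append(pool.pop(digit))
    return ordered


def decode(ordered):
    """Recover the integer hidden in the order of `ordered`."""
    pool = sorted(ordered)
    message = 0
    for remaining, item in zip(range(len(ordered), 0, -1), ordered):
        digit = pool.index(item)
        pool.pop(digit)
        message += digit * math.factorial(remaining - 1)
    return message


# Arbitrary example titles, not the applet's actual list.
songs = ["Le Freak", "Stayin' Alive", "I Will Survive", "Disco Inferno", "Good Times"]
hidden = encode(songs, 100)          # 5! = 120, so values 0..119 fit
assert decode(hidden) == 100
```

A text message can be turned into such an integer with int.from_bytes(text.encode("ascii"), "big"), provided the list is long enough to hold it.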




List makers are
o God's first and foremost gift to a chaotic world. 7%
o Addicted to order. 19%
o Unable to live with the roiling, unordered noise of life's rich pageant. 16%
o Snobs who take pleasure in their own discriminating taste. 33%
o Nervous nebbishes looking for a way to put off actually tackling chores. 23%

Votes: 42
Results | Other Polls


Steganography With Lists | 64 comments (42 topical, 22 editorial, 0 hidden)
Neat (4.00 / 3) (#16)
by YesNoCancel on Tue May 07, 2002 at 10:16:40 AM EST

But the applet doesn't work. I can enter a message and hit encode, and nothing happens. When I click on decode, it returns only gibberish (like "CEAGSNEONE TR ADA TAEWWHNND LEGES SRRFE8").

Hmm. I just retested it. (3.50 / 2) (#17)
by pwayner on Tue May 07, 2002 at 10:23:04 AM EST

It worked for me. (But that's often the case with software. The programmer never makes mistakes.)

If you tell me some more I can try to debug it.

[ Parent ]
I only tried it once (none / 0) (#43)
by YesNoCancel on Tue May 07, 2002 at 03:27:06 PM EST

The password I used was "test" and the message was "verschlüsselte nachricht".

Maybe it's a problem with the umlaut as it's a character in the 128-255 range.

[ Parent ]

Yes, it's probably the umlaut. (none / 0) (#44)
by pwayner on Tue May 07, 2002 at 03:52:00 PM EST

I don't even know if I can reproduce that one here. I'll give it a try. In the meantime, use the spelling without it. That is, just add an e. If it still doesn't work, I'll try again.
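The thread only suspects the umlaut, so treat the following as an illustration of the suggested workaround rather than a diagnosis of the applet. This Python sketch (the helper name to_ascii_german is invented here) lists the characters of a message that fall outside 7-bit ASCII and applies the usual German transliteration of adding an e:

```python
def to_ascii_german(text: str) -> str:
    """Replace German umlauts and sharp s with their plain ASCII spellings."""
    table = {
        "ä": "ae", "ö": "oe", "ü": "ue",
        "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
        "ß": "ss",
    }
    return "".join(table.get(ch, ch) for ch in text)


message = "verschlüsselte nachricht"

# Characters with code points above 127 do not fit in 7-bit ASCII.
print([(ch, ord(ch)) for ch in message if ord(ch) > 127])
print(to_ascii_german(message))
```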

Thanks for the bug report.

[ Parent ]
This gets my gears turning (4.00 / 2) (#19)
by jabber on Tue May 07, 2002 at 10:43:54 AM EST

How about embedding secret messages in artificially introduced spelling errors in a Kuro5hin article? Or in the META tags of sites that are linked to on a single page, so that there's no complete message in any one location? Oooh, I should write an article..

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"

All good ideas... (3.50 / 2) (#20)
by pwayner on Tue May 07, 2002 at 10:59:11 AM EST

At the risk of being upbraided for plugging the book, I will still point to Chapter 3 (error correcting codes) and Chapter 15 (text manipulation on the page).

The idea of using the meta tags is new to me, but splitting up a message is Chapter 4 (secret sharing).

[ Parent ]
Intentional spelling mistakes (4.50 / 2) (#22)
by Stavr0 on Tue May 07, 2002 at 11:32:28 AM EST

I remember a CD subscription from many years ago (Gartner Group??) which were reprints of all the computer magazines of the time: Byte, Infoworld etc..

They had intentionally introduced spelling mistakes as a way to track if their contents were being reproduced elsewhere without authorization. It was like a plaintext watermark :-)
- - -
All your posse are belong to Andre the Giant
[ Parent ]

Cartographers do it all the time (4.66 / 3) (#23)
by jabber on Tue May 07, 2002 at 11:38:30 AM EST

Most maps contain intentional errors, like errant streams or streets, so that if they're used without permission, this false feature can serve to identify the origins of the map.

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"
[ Parent ]

Yeah I've heard that (none / 0) (#45)
by cpt kangarooski on Tue May 07, 2002 at 03:55:42 PM EST

Dictionaries and phone books too. Humorously though, I recall that in one instance the allegedly false watermark item turned out to be real, ruining the plaintiff's case.

All my posts including this one are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
[ Parent ]
I got one better, perhaps. (none / 0) (#61)
by static on Thu May 09, 2002 at 01:10:31 AM EST

First of all, someone I used to know worked for the local street directory company. He once showed me the two "streets" - in reality, unnamed dirt tracks - he had named after his wife's cats for exactly that reason.

And there was also the completely fictional watermark street he'd named after his wife. Which was then chosen for the "how to find a street" page!


[ Parent ]

Yep, that's what I'm doing ... (3.00 / 2) (#24)
by Sir Rastus Bear on Tue May 07, 2002 at 11:52:17 AM EST

... whenever I spell anything wrong. :)
"It's the dog's fault, but she irrationally yells at me that I shouldn't use the wood chipper when I'm drunk."
[ Parent ]
OMG! (none / 0) (#54)
by gazbo on Wed May 08, 2002 at 05:25:43 AM EST

nodsmasher must be a terrorist.

Topless, revealing, nude pics and vids of Zora Suleman! Upskirt and down blouse! Cleavage!
Hardcore ZORA SULEMAN pics!

[ Parent ]

Happens also with streetmaps (none / 0) (#63)
by tetrode on Tue May 14, 2002 at 03:54:57 AM EST

In streetmaps, the publisher also makes some tiny mistakes to track if some other publisher just copies his data... Mark
________ The world has respect for US for two main reasons: you are patriotic, you invented rock'n'roll (mlapanadras)
[ Parent ]
Example of steganography (sort of) (5.00 / 1) (#29)
by BinaryTree on Tue May 07, 2002 at 12:24:06 PM EST

In one of my diary entries.

There is a hidden message in that. I didn't really mean it. I was hoping some astute K5er would notice, but it seems like nobody did. Or at least if (s)he did, (s)he didn't point it out.

Grocery Store Lists (4.00 / 2) (#31)
by westfirst on Tue May 07, 2002 at 12:28:36 PM EST

A shopping list seems ideal to me. You can have many items and it doesn't matter in what order they come. My grocery lists are all very free form documents filled with whatever tastes cross my mind.

There's some correlation between items because I do put down items that are near each other in the store. If I put down milk, I'll often list eggs if I need them. So this may not be perfect.

Actually, the best list would be the register tape. All of the items are essentially juggled when put in the cart. Then they're randomized again when put on the belt. Similar items do stick together, but there's still plenty of randomness. Maybe the grocery store could "email" a receipt for records?

Does This Qualify... (none / 0) (#32)
by thelizman on Tue May 07, 2002 at 12:42:04 PM EST

...as true steganography? It's not exactly as innocuous as hiding information in the minor bits of an image, and you would need exponentially more data.

"Our language is sufficiently clumsy enough to allow us to believe foolish things." - George Orwell
Hmmm (4.00 / 1) (#33)
by pwayner on Tue May 07, 2002 at 01:15:46 PM EST

Qualify? It depends how you define the word steganography. The Greek roots point to simple secret writing. I feel that it includes any way of disguising information to take another form. Some point to older ideas like microdots or burst transmitters. The book concentrates on mathematical models. So I think sending 175 bits in a list of disco songs fits the bill.

It does not require exponentially many entries. log2(n!) = log2 2 + log2 3 + log2 4 + ... + log2 n, which is less than n log2 n. Given that a list of n items takes n log2 n bits to represent (log2 n bits per item), the process is pretty close to optimal.

For instance, if n=256, then 1684 bits or 210 bytes can be encoded. Normally, 256 objects can be represented efficiently by 256 bytes or 2048 bits. That's an efficiency of about 82%.

The efficiency gets closer and closer to 99% as n gets very large. For most practical purposes, it stays around 90%. That means you can always encode k bits in about 1.1k bits-- if you use a minimal encoding for the items in the list. If you use disco songs, well, the items are going to be a bit larger. But they don't have to be too much larger.
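The trend can be read off Stirling's approximation. This is a standard derivation added for illustration, with E(n) used here as shorthand for bits hidden divided by the n log2 n bits of a minimal encoding:

```latex
\log_2 n! \;=\; \sum_{i=2}^{n} \log_2 i
        \;=\; n\log_2 n \;-\; n\log_2 e \;+\; \tfrac{1}{2}\log_2(2\pi n) \;+\; O\!\left(\tfrac{1}{n}\right)

E(n) \;=\; \frac{\log_2 n!}{n\log_2 n}
     \;\approx\; 1 \;-\; \frac{\log_2 e}{\log_2 n} \;+\; \frac{\log_2(2\pi n)}{2\,n\log_2 n}

E(256) \;\approx\; 1 \;-\; \frac{1.4427}{8} \;+\; \frac{10.65}{4096} \;\approx\; 0.822
```

The dominant correction shrinks only like 1/log2 n, which is why the efficiency climbs slowly as the list grows.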

Does this make sense?

[ Parent ]
Number of bits etc. (none / 0) (#49)
by phliar on Tue May 07, 2002 at 07:48:24 PM EST

By thelizman:
Not exactly as innocuous as hiding information in the minor bits of an image, and you would need exponentially more data.

First, a rough O() statement: log2(n!) is approximately n log2 n. Now I am going to pull a lot of numbers out of my ass....

If you use a hiding document -- say a photograph -- of size H bits, you could probably hide H/24 bits in it, let's call that a 1/24 ratio. If you have a list of n things, each (on average) of size K bits, you could hide about n log2 n bits in it, for a ratio of (log2 n)/K. For this to be innocuous, the list is probably a list of titles, and I'd guess that an average human title is about 20 characters or 160 bits. As n increases, the ratio improves. The ratios are the same for log2 n = 160/24, which makes n = 2^6.66 -- about 100.

In other words, you could hide about 640 bits in a "Top-100" list that was a total of 20KB (that's kilobytes) long; or in an image that was about 20KB.

By pwayner:

Given that n items always take log 2 n bits to represent, the process is pretty close to optimal.
Well, you can't just send a binary encoded version of the sequence -- that's just a substitution cipher! If you want it to be innocuous (steganography implies that an eavesdropper does not know that a message is being sent) -- you have to send the actual list itself.

Please check my argument (and my arithmetic!) in my first section above. The argument would be much refined if log 100! was calculated exactly instead of using a O() simplification. I think it makes sense that using list order to hide information is a reasonable way of doing things.
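As a quick check of that last point, log2(100!) can be computed directly and compared with the n log2 n shortcut. This is a small Python sketch using only the standard library, with the helper name exact_bits invented for illustration. The exact value comes out lower than the shortcut, roughly 525 bits against 664.

```python
import math


def exact_bits(n: int) -> float:
    """log2(n!) computed via the log-gamma function, since lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1) / math.log(2)


n = 100
rough = n * math.log2(n)   # the O() shortcut used in the comment above
exact = exact_bits(n)
print(f"n log2 n  = {rough:.1f} bits")
print(f"log2(n!)  = {exact:.1f} bits")
```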

Faster, faster, until the thrill of...
[ Parent ]

Arithmetic Again (none / 0) (#50)
by pwayner on Tue May 07, 2002 at 09:00:33 PM EST

Well, you can't just send a binary encoded version of the sequence -- that's just a substitution cipher!

Fair enough. I was just separating the claim of optimality away from the structure of the list. This is sort of fair because the code designer doesn't have control over the choice of the list items. I can only talk about the mathematical givens. But I will admit that it's a bit unfair to abstract away the cost of the steganographic payload.

I do the arithmetic a bit differently because I think you're confusing bits and bytes. If there are 20 bytes per list item, then 20 KB should have 1024 list items. log2(1024!) = 8769.01 bits or 1096 bytes according to Mathematica.

To be more precise, perhaps we should define some kind of believability factor-- in this case something close to 20. So we could let this be average bits per list item divided by average bits encoded in each item. In your example, this would be a factor of 18.684.

Of course, compression can be our friend too. There's really only about 3 to 4 bits of entropy per ASCII character. If I get to use this estimate, the expansion factor is about 7 to 9 for this example.

I also like to calculate an efficiency factor which is essentially the number of bits that can be stored in a list of items divided by the minimal number of bits it takes to represent the items as numbers. As you pointed out, this is just a substitution cipher.

Here's a table of the efficiency factors that I'm including just because I spent the time with Mathematica to compute them. It's badly formatted because we're denied table tags.

Items   E factor
2^7     .799
2^8     .822
2^9     .841
2^10    .856
2^11    .869
2^12    .879
2^13    .889
2^14    .897
2^15    .903
2^16    .910
2^17    .915
2^18    .920
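The table can be reproduced without Mathematica. The Python sketch below assumes the efficiency factor is log2(n!) divided by n log2 n, which is how the comment describes it; the function name efficiency is chosen here for illustration.

```python
import math


def efficiency(n: int) -> float:
    """Bits hidden in an ordering of n items divided by the bits of a minimal encoding."""
    bits_hidden = math.lgamma(n + 1) / math.log(2)  # log2(n!)
    bits_minimal = n * math.log2(n)                 # log2(n) bits for each of n items
    return bits_hidden / bits_minimal


for k in range(7, 19):
    print(f"2^{k:<4} {efficiency(2 ** k):.3f}")
```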

Does this make some sense? The bits and bytes confuse me some time too.

I'll note in passing that you can easily replace the least significant bit of an image-- something that produces an efficiency factor of about 8. But replacing all of the bits changes the entropy dramatically. All of the steganography detectors can pick this up. So list items seem to be a better choice.
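For comparison, least-significant-bit replacement looks roughly like this. The Python sketch works on a plain byte string standing in for uncompressed pixel data; it is not tied to any image format or to a particular tool, and the names embed_lsb and extract_lsb are invented for the example.

```python
def embed_lsb(cover: bytes, bits) -> bytes:
    """Overwrite the lowest bit of the first len(bits) cover bytes with the message bits."""
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | (bit & 1)  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, count: int):
    """Read back the lowest bit of the first `count` bytes."""
    return [byte & 1 for byte in stego[:count]]


cover = bytes(range(100, 116))        # stand-in for 16 pixel bytes
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, message)
assert extract_lsb(stego, len(message)) == message
```

Each byte changes by at most one, which is why the result looks the same to the eye even though its statistics shift.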

[ Parent ]
Arithmetic.... (none / 0) (#51)
by phliar on Wed May 08, 2002 at 01:36:10 AM EST

I do the arithmetic a bit differently because I think you're confusing bits and bytes.
No, I may have screwed up the arithmetic! Let me try again:

We have two factors that we can treat as constants: one is K, the mean size of a list item. The other is f, the fraction of bits we can perturb in a picture without it being suspicious.

For the "list order" approach, the "efficiency" is (log n)/K. For small n, the picture is definitely better; at what point do they cross?

(log2 n)/K = f, or n = 2^(fK)
What are reasonable numbers for these parameters? I thought K = 160 and f = 1/24 were reasonable, which makes the transfer point 100. In this case, we can hide about 650 bits in a message of 100*20 bytes; or we can hide 650 bits in a picture of about 2000 bytes. This is where I screwed up: I called 2000 bytes 20KB.

Maybe f should be 1/8; this makes the crossover point n = 1 million. In a list of 1 million items (20 MB), you can hide 20 million bits; or in a 20 MB picture, you can hide 20 Mb.

Now, I didn't use any fancy-shmancy Mathematica for all this; it's all back-of-the-envelope stuff, so independent verification would be a good thing!
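One way to verify the back-of-the-envelope numbers is a short Python sketch of the crossover formula n = 2^(fK). It keeps the same n log2 n approximation as the comment, so it inherits the same roughness; the function name crossover_items is made up for illustration.

```python
def crossover_items(f: float, k_bits: float) -> float:
    """List length n at which (log2 n) / K equals the image ratio f, i.e. n = 2 ** (f * K)."""
    return 2 ** (f * k_bits)


K = 160  # assumed mean item size in bits (20 characters)
for f in (1 / 24, 1 / 8):
    n = crossover_items(f, K)
    print(f"f = {f:.4f}: crossover near n = {n:,.0f} items")
```

With f = 1/24 this gives a little over 100 items, and with f = 1/8 it gives 2^20, about one million.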

Faster, faster, until the thrill of...
[ Parent ]

Entropy etc. (none / 0) (#52)
by phliar on Wed May 08, 2002 at 01:48:59 AM EST

One other comment which I somehow just read past in your reply:
I'll note in passing that you can easily replace the least significant bit of an image-- something that produces an efficiency factor of about 8. But replacing all of the bits changes the entropy dramatically.
This was in fact what I equated the efficiency factor to; but I chose 1/24 instead of 1/8, since (as you write) changing the LSB of each byte will change the entropy significantly. Hence I picked (out of my ass, as I said) a ratio of 1/24.

Obviously you would want to pick something without nice smooth tones as your starting image; better to first put some Gaussian noise in the LSB, and use that as the reference image. Now encoding compressed (i.e. random) data in the LSB should not be trivially detectable.

One other thing, about your applet: You're calling Giorgio Moroder disco??? Sacrilege!

Faster, faster, until the thrill of...
[ Parent ]

Maybe I'm Just A Moron (none / 0) (#35)
by Lagged2Death on Tue May 07, 2002 at 01:33:46 PM EST

Feel free to just slap me silly if I don't know what I'm talking about. It's happened before.

Assuming that when you say "image," you actually mean "photograph," then wouldn't JPEG encoding destroy any message encoded in the least-significant bits of the image, since JPEG works by throwing out the little details that the human eye isn't very sensitive to?

And if instead, you transmitted your secret-message images around in a non-lossy format (BMP or TIFF or whatever), wouldn't that be 1) a dead giveaway that all was not what it seemed and 2) a much bigger waste of bandwidth than the list idea?

Starfish automatically creates colorful abstract art for your PC desktop!
[ Parent ]

Yes, but.... (4.00 / 1) (#36)
by pwayner on Tue May 07, 2002 at 01:44:39 PM EST

JPEG is a great compression tool for photographs and it does destroy information when it is used. Flipping the least significant bit to store information won't work when JPEG is around. But it turns out the same trick can be used at a different level. Algorithms like JSTEG or F5 will flip the LSB of the JPEG coefficient. Those don't get changed. It can add a bit of noise (see the photos in the book), but it's surprisingly good.

I don't know if BMPs or TIFF files would draw attention. There's sooo much inefficiency on the web. Also, some photo purists like the extra detail. JPEG can leave artifacts.

[ Parent ]
compressed formats (none / 0) (#37)
by jfkominek on Tue May 07, 2002 at 02:32:14 PM EST

you can use compressed formats, and hide your data in suboptimal compression tables. the files would still be nearly as small, but christ, who'd ever look to see if they're as small as they ought to be?

[ Parent ]
Great idea. (none / 0) (#42)
by pwayner on Tue May 07, 2002 at 03:16:38 PM EST

Some tables are pretty small, but others for multi-character huffman compression aren't. The main zip algorithm is table-less, though, if I remember correctly.

[ Parent ]
The Top N (3.33 / 3) (#34)
by Altus on Tue May 07, 2002 at 01:18:34 PM EST

Simpsons episodes of all time.

my friends and I often refer to a given episode as being a "Top 5" episode... it didn't take long for someone to point out that we have well over 20 or 30 "Top 5" episodes. It's almost impossible to list your favorite Simpsons episodes in order (assuming you are a Simpsons fan)

seems to me that lists like this would make an excellent example as it is entirely subjective and you could easily manage a list length of 50 or more.
"In America, first you get the sugar, then you get the money, then you get the women..." -H. Simpson

Speaking of The Simpsons... (none / 0) (#39)
by curunir on Tue May 07, 2002 at 03:01:58 PM EST

...should I be worried that the current Major League Baseball standings decode to my home address?

[ Parent ]
Do you really want to know the truth (5.00 / 2) (#41)
by Altus on Tue May 07, 2002 at 03:09:46 PM EST

or would you rather see Mark McGwire hit some dingers!
"In America, first you get the sugar, then you get the money, then you get the women..." -H. Simpson
[ Parent ]
Historical sidenote. (4.50 / 6) (#38)
by Apuleius on Tue May 07, 2002 at 02:49:31 PM EST

DJs were not allowed to accept listener requests in the US during World War 2. Guess why.

There is a time and a place for everything, and it's called college. (The South Park chef)
Steganography (4.50 / 2) (#40)
by IHCOYC on Tue May 07, 2002 at 03:06:15 PM EST

The concept of steganography was first proposed, IIRC, by Johannes Trithemius, whose notorious grimoire Steganographia proposes a method of secretly introducing messages into books and other texts by various means. He suggests texts in which only certain letters or characters, computed by algorithm, are significant to the secret message. Key pages can be placed over a longer text, and reveal the true message by virtue of holes punched in the sender's and receiver's keys.

The curious thing about Trithemius's work is that he proposes his art of steganography in the context of a larger and unintelligible work of angelic and planetary magic, which contains the usual sorts of texts that grimoires usually contain:

Dricho mosayr vsio noes veso tureas.
Abrithios naselion pyrno chyboyn ormon.
Ceruali myrbeuo lian saueao sayr.

I hope you're not a lip reader. Of course, the combination of the two discussions in the same book is practically a challenge to uncover the secret messages that might be concealed within.

Athanasius Kircher later came into possession of the Voynich Manuscript, a still undeciphered late mediæval manuscript in an unknown language and script. A number of solutions to reading this mysterious text have been proposed and rejected. Perhaps steganography holds a clue to how its repetitive text can be read?

This message has been placed here IN MEMORIAM by the Tijuana Bible Society.

Steganography is *Much* older than that... (none / 0) (#48)
by Obvious Pseudonym on Tue May 07, 2002 at 06:22:49 PM EST

According to Herodotus, it was used in 480 BC by a Greek named Demaratus, who warned Greece of the imminent surprise attack by Xerxes the Great. The element of surprise was lost and Greece was saved.

Obvious Pseudonym

I am obviously right, and as you disagree with me, then logically you must be wrong.
[ Parent ]

shave heads (none / 0) (#53)
by SocratesGhost on Wed May 08, 2002 at 02:53:46 AM EST

the romans had the occasional practice of shaving the head of a messenger and tattooing a message to the scalp. When the hair grew back in, the messenger was sent. Even if captured, the messenger could show evidence that they carried absolutely nothing. As a result, they could more easily navigate across dangerous territories. This is a slightly famous application of steganography.

I drank what?

[ Parent ]
Not many critical messages (none / 0) (#58)
by porkchop_d_clown on Wed May 08, 2002 at 02:57:32 PM EST

That can wait for your hair to grow back.

I feel like I've lived my live in screensaver mode....

[ Parent ]
not every message... (none / 0) (#60)
by SocratesGhost on Wed May 08, 2002 at 10:08:58 PM EST

is critical. Sometimes, they just need to be delivered secretly.

Of course, the idea of having a whole bunch of servants whose heads have various messages at your command would be sort of amusing. Imagine the scene: you're being attacked by Vandals, and you yell out: "Hey, you. Mr. Send-more-troops-we're-being-attacked-by-Vandals. Get to Rome quickly!"

I drank what?

[ Parent ]
Agent Letterman (1.00 / 1) (#46)
by cyberdruid on Tue May 07, 2002 at 04:07:54 PM EST

If Letterman wanted to transmit secret messages, he'd do it through his tie. He changes them all the time.

Actually, yeah, he could. (4.00 / 1) (#57)
by porkchop_d_clown on Wed May 08, 2002 at 02:54:07 PM EST

They would have to be simple messages, like "Abort the mission!" or "Execute Plan A!", but it would be uncrackable.

I feel like I've lived my live in screensaver mode....

[ Parent ]
*whirrrr* (1.00 / 1) (#47)
by ramses0 on Tue May 07, 2002 at 04:29:58 PM EST

That was the sound of 100 Secret Service agents firing up their web browsers and grepping across all of iGrrrl's diary.  ;^)

[ rate all comments , for great ju

Hasn't this been done before? (none / 0) (#55)
by ksandstr on Wed May 08, 2002 at 10:58:16 AM EST

I remember seeing a program that would encode a small number of bytes in a GIF image by rearranging the order in which pixel colours would be defined.

Neat stuff, in any case.

"Perustakaamme siis maitokaivoksia, ja lehmistä saamme hiiltä."

Yes, that is called GifShuffle (5.00 / 1) (#56)
by pwayner on Wed May 08, 2002 at 11:13:32 AM EST

There's a mention on the webpage.

The big difference is that this uses a hash function to add password protection.
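The thread does not say how the applet combines the hash with the list, so the following Python sketch shows only one plausible arrangement, stated as an assumption: derive the shared starting order of the items from the password, so that someone without the password cannot tell which permutation number a given list represents. The function name keyed_order and the choice of HMAC-SHA256 are this example's own.

```python
import hashlib
import hmac


def keyed_order(items, password: str):
    """Sort items by a password-dependent tag to get a secret canonical order."""
    key = hashlib.sha256(password.encode("utf-8")).digest()

    def tag(item: str) -> bytes:
        # HMAC of the item under the password-derived key
        return hmac.new(key, item.encode("utf-8"), hashlib.sha256).digest()

    return sorted(items, key=tag)


# Arbitrary example titles, not the applet's list.
songs = ["Le Freak", "Stayin' Alive", "I Will Survive", "Disco Inferno"]
print(keyed_order(songs, "test"))
```

The result depends only on the set of items and the password, not on the order in which the items arrive, so sender and receiver reach the same baseline before encoding or decoding a permutation.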

[ Parent ]
What lists? (none / 0) (#59)
by epepke on Wed May 08, 2002 at 06:00:54 PM EST

Lines from Monty Python. There is a vast number of them, and they're always worth repeating.

The truth may be out there, but lies are inside your head.--Terry Pratchett

My List (none / 0) (#62)
by moondog on Sun May 12, 2002 at 01:15:08 PM EST

Top phrases I think about

1. All in the family
2. Your jeans are on fire
3. Base ball in the summer
4. Are you alone?
5. Belong to me.
6. To each his own.
7. Us policy in Iraq


SNOW steganography (none / 0) (#64)
by Steve C on Wed Jun 05, 2002 at 08:48:14 AM EST

One fun program for steganography can be found here: http://www.darkside.com.au/snow/

"The program snow is used to conceal messages in
ASCII text by appending whitespace to the end of lines. Because spaces and tabs are generally not visible in text viewers, the message is effectively hidden from casual observers. And if the built-in encryption is used, the message cannot be read even if it is detected."
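The general idea of trailing-whitespace hiding can be sketched in a few lines of Python. This is a deliberately simplified scheme (one bit per line, a trailing space for 0 and a trailing tab for 1) and not SNOW's actual encoding or file format; the names hide and reveal are invented for the example.

```python
def hide(lines, bits):
    """Append one trailing space (bit 0) or tab (bit 1) to each of the first len(bits) lines."""
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the message")
    result = []
    for i, line in enumerate(lines):
        line = line.rstrip(" \t")          # drop any whitespace already there
        if i < len(bits):
            line += "\t" if bits[i] else " "
        result.append(line)
    return result


def reveal(lines):
    """Collect the bits carried by trailing tabs and spaces, in line order."""
    bits = []
    for line in lines:
        if line.endswith("\t"):
            bits.append(1)
        elif line.endswith(" "):
            bits.append(0)
    return bits


cover = ["All in the family", "Your jeans are on fire", "To each his own."]
stego = hide(cover, [1, 0, 1])
assert reveal(stego) == [1, 0, 1]
```

Lines that carry no bit are left without trailing whitespace, which is how reveal knows where the message ends.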
