In a market in which value is ascribed to novelty
Not everyone does that; I, for one, do not. It's not novelty that counts, it's quality. And I won't let the market dictate what is valuable and what is not. Otherwise one might argue that the new books released in the Dark Ages were more valuable than the old paper they were written on, which often contained gems by ancient authors that were only rediscovered more than 1000 years later.
So I repeat: Most valuable books aren't digitized and won't be for some time to come (although of course their percentage will grow smaller as new high-quality books are released digitally and the old content slowly becomes obsolete). I'm not using the term "value" in an economic sense here, since that would be utterly pointless.
Typical economic life for published materials is less than five years, even if you exclude periodicals.
There are various reasons for this, one of them being the marketing machine that is currently used to sell books. Since we do not have proper rating mechanisms for books yet, it's no surprise that the market is dictated by the publishers' marketing. However, this doesn't say a thing about the actual books' quality. I know excellent books (fiction and nonfiction) from 1995, 1990, 1985, 1980, 1970, 1950; most of them are not digitized (to my knowledge), which is a pity.
The materials will be digitized by publishers, libraries, and individuals, for archival, storage, storage-reduction, and research purposes.
This is certainly a hope I share with you, and one should also hope that they will be accessible after digitizing. Right now, I am doubtful about it. The director of the LOC, for example, has explicitly stated that they don't want to digitize their content for copyright reasons.
Works which have a current value will be digitized.
Even in an economic sense, this is not necessarily true. I certainly hope that publishers will re-release out-of-print books in electronic form to make more money with them, but I wouldn't hold my breath; right now, most of them are scared shitless of ebooks and the net. And the people who scan books are usually a very few dedicated individuals with their own special definition of quality, which has little to do with the current economic value of the book.
The act need only be done once;
Theoretically, yes. Practically, I am pretty sure that more books rest electronically on some people's hard drives than are circulating on the Internet: scanned for the purpose of searching/indexing, but not distributed due to fear of prosecution. But as you said, this may change over time, and I'm hopeful that at least the number of ebooks will soon increase dramatically.
I've photocopied more than one book by hand
So have I.
courtesy a former life producing college readers at a large national photocopy chain. Ten to twenty pages a minute is a reasonable rate -- that's 600 pages an hour sustained.
I have gotten similar rates, but scanning is much slower, about half that speed with a $1000 scanner like mine.
The process today actually does do just what I mention below: most current mid to high-level photocopiers actually make a digital, not electrostatic, image of the material copied.
Yeah, but you don't get the bytes outta there (or do you?), and you'll hardly be able to afford one for your home.
While OCR and manual re-editing are not trivial, the tasks are sufficiently simple that a person with access to quite mundane technology in the US, EU, or Japan, could readily convert a book over a weekend or so.
Make that two weekends. It's a lot of work, and most people don't do it. That is the largest hindrance, and it's really not very hard to see. Unless it's done collaboratively, it's simply too much effort. Converting a CD to MP3 takes less than an hour and is completely automatic. Digitizing a book is a boring ~10 hour effort. Trust me on this; take www.pfaffenspiegel.de as an example of a book I've scanned, proofread and HTMLized. But well, my quality standards may be higher than yours.
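To show roughly where a ~10 hour figure can come from, here is a minimal back-of-the-envelope sketch. The 300-page book length and the 1.8 minutes per page for proofreading and markup are assumptions picked for illustration, not measurements; the 5 pages per minute is simply half of the low end of the photocopying rate quoted above.

```python
def hours_per_book(pages, scan_pages_per_min, proofread_min_per_page):
    """Rough effort estimate, in hours, for digitizing one book by hand."""
    scan_minutes = pages / scan_pages_per_min
    proofread_minutes = pages * proofread_min_per_page
    return (scan_minutes + proofread_minutes) / 60


# All three inputs are illustrative assumptions:
#   300 pages          - a hypothetical mid-sized book
#   5 pages/min        - half of the 10 pages/min photocopy rate
#   1.8 min per page   - guessed time for OCR proofreading and HTML markup
estimate = hours_per_book(300, 5, 1.8)
print(f"{estimate:.1f} hours")
```

With these inputs the scanning itself is the small part of the job; nearly all of the time goes into proofreading, which is the step that cannot be automated away.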
Where individual incentive to do so exists, it will happen.
Oh, isn't that always so?
Multiply this by several hundred million computers in the world
Doesn't compute. There aren't several hundred million computers in the world with scanners. Of those who have scanners, only a small percentage has acceptable OCR software like OmniPage Pro. And the percentage of these willing to scan whole books is much lower again.
A well-stocked megabookstore in the US might have a hundred thousand titles. This could be digitized literally overnight if 0.1% of all computer users took it into their minds to do so
You will never find enough people, unless you organize the scanning & proofreading of individual books collaboratively. Even then, it will be hard to digitize little-known, high-quality titles, but at least those of high "economic value" should be digitized. As I said, Freenet might be a good solution here.
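The disagreement here is really about which fractions you multiply together. A rough sketch of both calculations follows; the 300 million stands in for "several hundred million", and the three filter fractions (scanner ownership, decent OCR software, willingness) are purely hypothetical placeholders, not survey data.

```python
TITLES = 100_000           # the megabookstore from the quoted claim
COMPUTERS = 300_000_000    # assumption standing in for "several hundred million"


def willing_scanners(computers, has_scanner, has_good_ocr, willing):
    """People left after each filter is applied in turn."""
    return computers * has_scanner * has_good_ocr * willing


# The quoted claim: 0.1% of all computer users pitch in.
optimistic = round(COMPUTERS * 0.001)

# The objection: filter by scanner, then OCR software, then willingness.
# All three fractions are hypothetical.
filtered = round(willing_scanners(COMPUTERS, 0.10, 0.10, 0.01))

print("volunteers (0.1% of users):", optimistic)
print("volunteers (after filters):", filtered)
print("books per volunteer (after filters):", round(TITLES / filtered, 1))
```

Under the quoted 0.1% there are more volunteers than titles; under the filtered assumptions each volunteer would have to do several books, at roughly ten hours apiece. Change the fractions and the conclusion flips, which is exactly why the raw number of computers settles nothing.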
Scanning and OCR are already cheap enough to be household technologies
No. Cheap scanners have terrible speeds and the bundled OCR software is crap.
And outside of Germany? (/me ponders turning the old US-centrist debate around and asking when Germany became the center of the Universe).
Well, I wouldn't be surprised if other countries passed similar laws. But I can only talk about my own situation here.
But they are portable, and that is the sufficient precondition for the remaining processing.
No, if they're too low DPI, they can only be put to the next stage through human typing, at least with current technology (things will change, sure).
I: [That's rather uninteresting, IMHO. You can't prevent something from "leaking out" while at the same time trying to run a business by spreading it. ]
To the contrary, this is precisely the question for the music and publishing industries. Watermarking, digital rights protections, copy protection, DeCSS, electronic paper,
Watermarking: wrong concept, needs software to control user, this will be easily cracked.
digital rights protections/copy protection/DeCSS: The legal question is extremely important, I agree about that. Other than that, these issues are mundane; it's obvious that copy protections don't work.
Electronic paper: Yeah, but the information has to get into the "paper" somehow, so that's not an issue.
Napster, Gnutella, FreeNet
These are not important for cracking the content protection. You said "how are the people who have to work with digitized material going to keep it from leaking out?" and I think this refers to content protection mechanisms. I just think -- and you probably agree with me -- that whether copy-protection systems are secure is not a real question for anyone who knows what he's talking about: copy protections can never work. Of course it will be interesting to see how long the content industry takes to realize that, and how they'll fight the distribution mechanisms and the crackers. New distribution mechanisms and the way the content industries deal with them are of high importance, but that's not what I was talking about.
Right now, there's simply not a lot of non-IT content for "Bookster", and that's the major problem.
Copyright law is bad: infoAnarchy · Pleasure is good: Origins of Violence
spread the word!