Make your own e-books
January 13, 2011 6:24 PM

Still clinging desperately to those reading-things of yours made from dead trees? While you're at it, scan the damn thing and make your own e-book. (My prediction is that there are copyright issues here that the manufacturer is ignoring, but that will come back to haunt them.)
posted by anothermug (48 comments total) 9 users marked this as a favorite
 
the ironically named Book Saver

The author appears unfamiliar with the phrase "out of print."
posted by exogenous at 6:35 PM on January 13, 2011 [1 favorite]


How does this adjust for different size books? It's not clear from the picture.
posted by unliteral at 6:38 PM on January 13, 2011


It is possible to do it yourself, previously.
posted by exogenous at 6:39 PM on January 13, 2011 [4 favorites]


Substantial noninfringing uses, dog. Book scanners already exist and copyright violations only happen when you scan a copyrighted work (and even then it could possibly be a fair use). There might be a lawsuit, but there won't be a good one as long as the manufacturer doesn't market them as piracy-machines.
posted by Grimp0teuthis at 6:39 PM on January 13, 2011 [2 favorites]


The National Post link has a bit more meat to it. It mentions that one still has to turn the pages manually, so it looks like it'd still be a bit of a PITA for anything other than occasional, small-batch use. And there's still the issue of imperfect OCR, of course. Not really seeing how it's so much more revolutionary than just using a flatbed scanner, honestly.

The National Post link also mentions that it might be economically feasible for a group of students to go in on one of these and use them to duplicate their textbooks, but to be honest people are already doing that with the aforementioned copy machines, to the point where pretty much any university-common text is available pre-scanned via bittorrent.

Neither solution can cope with the fact that more and more university courses use online materials which require a login key, which is either included with or sold separate from the accompanying textbook.
posted by Scientist at 6:43 PM on January 13, 2011


Not really seeing how it's so much more revolutionary than just using a flatbed scanner, honestly.

The use of a cradle means that you can scan books with little damage to the spine, which is important for older works (to get a good image on a flatbed scanner, it's often necessary to slice off the binding). THAT'S why it's called a Book Saver.

Yes, that means you have to turn the pages manually, but on the plus side you still have a physical book in the end.

Gosh, it's like you folks don't hang out on Distributed Proofreaders!
posted by muddgirl at 6:50 PM on January 13, 2011 [6 favorites]


The method is a very simple version of what Google seems to do (in the sense of open book, use two cameras, click, turn page). Here's the company's website with more details.

Here's the problem - it's not the 1 second to take the picture, it's the 10 or more seconds to do all the mechanics to turn each set of pages. If you have a 400 page book, you need to take around 200 photos, which take 2000 seconds, which means at least a half hour or more of tedious work per book. If your library has 100 books, well, now you're talking a solid week of work just to scan the books - not to mention whatever else you'd do to clean up the files. I guess if it's worth it to you, then, well, there you go. Of course efficiencies come instead from infringing - just downloading what others scan, not scanning yourself. This isn't like ripping a CD or movie, which is automatic and painless.
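The estimate above is easy to check in a few lines of Python. This is only a back-of-the-envelope sketch: the 10 seconds per spread and the 100-book library are the figures from the comment, and the function name is made up for illustration.

```python
import math

def scan_time_seconds(pages, seconds_per_spread=10.0):
    """Time to scan one book when each photo captures a two-page spread."""
    spreads = math.ceil(pages / 2)  # two facing pages per photo
    return spreads * seconds_per_spread

one_book = scan_time_seconds(400)   # 200 spreads at 10 s each
library = 100 * one_book            # 100 books of the same length

print(f"one book: {one_book / 60:.0f} minutes")
print(f"100 books: {library / 3600:.1f} hours")
```

With these inputs one book comes to about 33 minutes and the whole library to roughly 56 hours, so "a solid week" holds if it means a full working week of scanning.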

All that said, I have some vague interest in this thing if it meant I could scan cookbooks for use on an iPad like device.
posted by Muddler at 6:51 PM on January 13, 2011 [2 favorites]


Yeah, I don't see a copyright issue for the manufacturer. I could see the publishing industry trying, but they would lose.

A greater danger will be the first company that comes out with great WYSIWYG ePub generating software. OCR has been around for a long time.
posted by cjorgensen at 6:54 PM on January 13, 2011


But yes, if we're talking about pirating the latest Harry Potter book (or a mass-market cookbook that you've purchased and want to view on a private device), there's no reason not to go with the much quicker method that requires less processing - aka separating the pages and automatically feeding them through a flatbed scanner.
posted by muddgirl at 6:55 PM on January 13, 2011


Oh, I can definitely see how it's useful in certain applications and how it has advantages over using a flatbed, just not revolutionary advantages. Sorry for not being so clear. The manufacturers of this device (and those who are breathlessly paraphrasing their press-releases) are unequivocally heralding this as possibly doing for books what CD-burners did for music and DVD-burners did for movies. While I can see the BookSaver being a useful device with valid applications, I don't see it sparking any massive revolutions.

Now, if they incorporated a tiny robot to turn the pages for you...
posted by Scientist at 6:55 PM on January 13, 2011


Whoops, didn't mean to imply that scanning a book for private use was necessarily piracy. IANAL and in my little fiefdom that would be a privilege of ownership.
posted by muddgirl at 6:56 PM on January 13, 2011


Back when I was editing a monthly newsletter, I got a lot of type-written manuscripts in the mail (email & internet was in its infancy) and the OCR software I was using was so mistake-laden that it was scarcely faster to go through the scanned/converted file than it was to just re-type most things. Has OCR gotten a lot more accurate since '95?

Also, flatbed scanners are really, really slow for this kind of thing. Something that generates an image of two pages in 1 second is really pretty fast by comparison. The resultant file though, is it accurate & error free? Because I'm not going to re-proof-read a 200 page book.

Either way, it's probably still way too labor-intensive to cut significantly into any book or e-book seller's number, at least in the near future. I'd welcome it for rare, out-of-print & public-domain stuff, though. I just ordered a used, out-of-print book today, and had to settle for "good" condition, because "fine or excellent" copies were selling for multiple hundreds of dollars.
posted by Devils Rancher at 7:07 PM on January 13, 2011 [1 favorite]


Personally, I'd say that MP3 compression had a far bigger impact than just CD burners, at least for people who weren't just running illegal dupe farms. To make a pre-MP3 mix CD, I had to rip the tracks onto my hard drive in WAV format, then burn them to the CD, then delete them because that 750MB of uncompressed audio was clogging up my hard drive. It was a cool thing to be able to do, but it wasn't a revolution.

Anyway. From what I understand this isn't going to OCR the text in the books (which may be a blessing - OCR scans of out-of-copyright books on the Kindle Store tend to go slightly crazy around accented vowels) - just create a PDF document with each page as a page-sized image. So, that's going to make it less useful for creating e-Books for readers optimised for ebook formats, although fine for tablets.

As mentioned, you also have to turn the page manually each time - so even at a spread a second, a 360 page book then becomes a solid half-hour standing by this device, not really able to do anything else. Not necessarily a problem for a rare book or a cherished one, but a whole library? Ripping a CD collection to iTunes was bad enough, and that was one action per unit. I have thousands of books.

ION Audio make a range of devices for digitising analogue media - USB-connected turntables, USB-connected tape decks, that sort of thing, so are probably pretty confident on the legal front. Amazon might prefer it if you bought a Kindle version of your book rather than spending an hour scanning in your existing copy, but they can't do a lot about it, and it really feels like for a mass-market paperback it would be easier just to buy the eBook.
posted by DNye at 7:11 PM on January 13, 2011


If you ever get the opportunity to sit down with fake, buy the dude a jack and coke and get him to tell you the adventures of his DIY book scanner. It's not my story to tell, so I won't say any more about it, but it's a good one.
posted by carsonb at 7:19 PM on January 13, 2011 [1 favorite]


Will "tree books" become the pejorative equivalent of "snail mail"?
posted by Joe Beese at 7:27 PM on January 13, 2011 [3 favorites]


This article doesn't provoke much of a reaction from me. I saw a book scanner last year at the ARMA Canada Conference in London, Ontario and I thought it was pretty neat. It could save a scan to USB among other things - I believe it was an Indus model. I remember back in the day when I worked at the Mount Royal University library and so many books would be damaged from photocopying. It was sad. I also remember having to change toner in the copier and getting toner everywhere despite my best efforts not to. I see these machines as a good thing...and the copyright problem a knee-jerk reaction. For private and company libraries, these machines are a boon.
posted by Calzephyr at 7:37 PM on January 13, 2011


We have two or three very hard to find books in our library which I would love to preserve somehow, beyond simply putting them in a shelf and never touching them again. I've been hesitant to scan them because of the demands that makes on the binding, and obviously don't want to cut them up just to get the pages to lie flat.

I don't think purchasing one of these for such a small number of books makes a lot of sense, but it's pretty damn tempting.
posted by hippybear at 7:38 PM on January 13, 2011


Oh, well here's a recent story about the DIY book scanner that fake's told online, one I hadn't heard before.
posted by carsonb at 7:39 PM on January 13, 2011


At 1 page per second. For the last book I read that's a mere 10 hours and for the current one just 5 hours.

The time it takes me to back up a DVD or CD is approximately 1 minute.

Books will be safe from this kind of thing for a while.

However, given the scans that show up around the place people either do this or they have access at some institution to automatic scanners.
posted by sien at 7:44 PM on January 13, 2011


Gah, woops, that's totally wrong. Apologies on a total math fail. At 10 minutes it's well possible.
posted by sien at 7:45 PM on January 13, 2011


Much easier to just chop the spine off and send the pages through a sheet feed scanner. Yeah you ruin the book but it's a small cost compared to the time and effort of that manual labor lift scanner design. Ok, maybe for one book or two I might do it for fun, but for dozens or hundreds of books no way, carpal tunnel and/or OCD.
posted by stbalbach at 8:00 PM on January 13, 2011


Has OCR gotten a lot more accurate since '95?

Yes. And faster.

At 1 page per second. For the last book I read that's a mere 10 hours and for the current one just 5 hours.

I know you corrected your math, but even at 10 hours I see this being a problem. Sure, you're not going to do your whole library, but if you do one book you can seed it and download all the others in your library.

I'm of the mindset that the digital copy of a book should be free with the purchase of the physical, but that's just crazy talk. I don't, but would have no moral qualms about illegally downloading the digital copy of a book I already own. Just as I have no qualms about making a digital copy of a CD I already own. Once I have ownership of the item how I choose to consume it is no one's business. Doing the actual labour of the scanning or ripping is irrelevant.
posted by cjorgensen at 8:05 PM on January 13, 2011 [1 favorite]


WANT. I work in academia and I swim in books. I still have a lot of them there paper-like books, but I much prefer to have a digital copy to annotate and mark up. I do a lot of trans-local research, which means I travel a lot and often between continents, so the idea of being able to make PDFs of all the research books I currently need and then jetting off to my next research site with a thumb drive and/or web folder is really, really fucking sexy to me.
posted by LMGM at 8:58 PM on January 13, 2011 [1 favorite]


Seems like this is only news in the sense that it's a cheap consumer model. There have been commercial versions of this for at least a few years. I remember seeing a scanner that used a gentle vacuum to automatically flip the pages while keeping the book open at 45° to save the binding.
posted by Deathalicious at 9:14 PM on January 13, 2011


Here's the problem - it's not the 1 second to take the picture, it's the 10 or more seconds to do all the mechanics to turn each set of pages.

Most high-speed devices start by sawing off the spine of the book or magazine. Devices like this one are only used for books that are sufficiently delicate and/or unique that you don't want to turn the pages any faster than that anyway.
posted by GuyZero at 9:14 PM on January 13, 2011 [1 favorite]


One of the neatest things on scanning I've seen lately is Dan Reetz's ruminations on dewarping page images.

fake- is that you? awesome!
posted by zamboni at 9:27 PM on January 13, 2011 [2 favorites]


If I wanted an ebook, I'd have an ebook. I don't want ebooks. I want, you know, books.
posted by Justinian at 11:08 PM on January 13, 2011 [2 favorites]


Cut to the chase. Already available: 1000 fps consumer-grade cameras - admittedly not at OCR resolution, but 40 fps HD. Cheap computers: billions of ops a second. SF trope of advanced alien/android/idiot savant able to read book by riffling pages past compound eyes/Zeiss peepers/pebble spectacles in a second: not so far away.

Or cut to the fade: a scanner capable of reading a closed book. Or a library. I'm sure the TSA has paid some bunch of nogoodniks a whole pile of cash to fail at this already, while the Google Maps cars probably have the tech to download your bookshelves and rank your collection of gentleman's literature in the two seconds it takes the driver to sip a latte as they cruise past your crib.

In the end, everything that crawls upon the face of the earth and reflects the Lord's photons - at whatever frequency - shall be taken up and digitised. For so it is written.

Get on with it already, sayeth the Lord.
posted by Devonian at 1:12 AM on January 14, 2011


Makes me wonder about building a page turning machine out of Mindstorms.
posted by Goofyy at 1:35 AM on January 14, 2011


Cut to the chase. Already available: 1000 fps consumer-grade cameras - admittedly not at OCR resolution, but 40 fps HD. Cheap computers: billions of ops a second. SF trope of advanced alien/android/idiot savant able to read book by riffling pages past compound eyes/Zeiss peepers/pebble spectacles in a second: not so far away.

There are already prototypes that let you scan about 200 pages/min by flicking through the book in front of a camera.

Makes me wonder about building a page turning machine out of Mindstorms.
One such design
posted by James Scott-Brown at 2:13 AM on January 14, 2011


This, as with tapes, CDs and vinyl record to .MP3 or VHS to .AVI, is the ideal part-time job for the full-time WoW player. Just sit it beside the computer, and change it every time it beeps. Fourteen hours pass, you've put a level onto each of three alts and been in two guild raids, and you have a shelf of books done. Same again tomorrow.
posted by aeschenkarnos at 2:33 AM on January 14, 2011


"...it'd still be a bit of a PITA for anything other than occasional, small-batch use."
"...the problem ... it's the 10 or more seconds to do all the mechanics to turn each set of pages."


I recently digitized a bunch of cassettes and vinyl - 16GB total. It took me all summer.

I copied them all to a thumb drive for my brother. It took me 60 seconds.
I copied them all to a thumb drive for my mom. It took me 60 seconds.
I copied them all to a thumb drive for my sister. It took me 60 seconds.
I copied them all to a thumb drive for my nephew. It took me 60 seconds.

Marginal cost: it's my bitch.
posted by klarck at 5:05 AM on January 14, 2011 [1 favorite]


Check the details guy, if you go to "documents" and check out the datasheet the document actually says one page per minute. Now where is the transcription error?
posted by knz at 5:10 AM on January 14, 2011


This would be invaluable for books that are in bad shape.

I once got a book from ILL that was one of the remaining copies of the minutes of the Royal Commission on VD in 1913. I have no idea on EARTH why they sent that copy out.

The book was so fragile that just opening it made some of the edges crumble, so I closed it, wrapped it in bubble wrap, wrapped it in a bag, taped the bag shut and took it back to be shipped back to them with a note saying they probably wanted to take it out of circulation, because it could not be opened without damage.

That book should be scanned before it crumbles away and future generations lose the chance to read about how our current morality-based arguments against treating disease have a long history.
posted by winna at 6:24 AM on January 14, 2011


The error is on the datasheet, I imagine. It also says:

While similar devices require up to seven seconds per one page, Book Saver takes only one second per two pages!


Which makes a lot more sense than one page per minute. The seven seconds per page, I imagine, is for a scanner, which either runs a light over the page or runs the page through a light, whereas this takes photographs of the two pages and puts those two images into an ongoing PDF, which is then saved when you tell it by button press that you've finished the book, I imagine.
posted by DNye at 6:28 AM on January 14, 2011


If I wanted an ebook, I'd have an ebook. I don't want ebooks. I want, you know, books.
posted by Justinian at 1:08 AM on January 14


I foolishly purchased the majority of my books before ebooks were practical or even existent. This device will be living in my house within a year of release.
posted by jtron at 6:49 AM on January 14, 2011


One of the neatest things on scanning I've seen lately is Dan Reetz's ruminations on dewarping page images.

fake- is that you? awesome!


It's most definitely me. :) Thanks!

Here's my take on the Ion. First off, it's not real yet, this is a mockup made of printed plastic. They claim some number of pages an hour to seven seconds a page. At seven seconds per page, it's no better than a flatbed. At that price point, the cameras in it can't be excellent. HOWEVER, I am willing to be surprised and if they do a good job with this obviously DIY-inspired scanner, it will bring scanning to the mainstream and the world will be better for it. I will be glad about it.

A cheap, commercially-made scanner does change the DIY build-your-own value proposition, but it doesn't change fundamentally interesting scanning issues like the dewarping problem. For those who aren't familiar, the dream is to take a picture of a curved page, and make the resulting page image flat, as though it had been perfectly scanned on a flatbed scanner. Meaning we could stop building these bulky scanner frames. My friend Rob, a mathemawizard and maker - no kidding, he builds mechanical computers, visited me in LA a month or so ago and really drove home some of the remaining problems in dewarping, from his own experience.

We're not the first to dream like this. To dream that we could just wave a camera at a book and the images would come out flat and beautiful. And fast. It's a hard problem. Google had problems with it. Snapter was a total failure. Most apps don't deal with it. But the app that my community helps with, Scan Tailor, is starting to get it right. The idea there is to figure out the page curvature by looking at the lines of text on the page.

Although it's possible to flatten out pages using only lines of text on a page, it's really hard and it doesn't work for whole classes of books. Every time I mention the dewarping problem, some programmer type says "that should be easy! just look at the lines of text on the page". Well, we're doing that. (but come on, anyone who reads books knows that not all pages have lines of text, and not all pages have lines of text that extend to the margins...). I'm convinced there are better ways.
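For readers who haven't seen the "look at the lines of text" idea spelled out, here is a toy sketch of it in Python. It is not Scan Tailor's algorithm or anyone else's; the function names, the quadratic curve model and the tiny synthetic image are all assumptions for illustration. It fits a curve to one text baseline and then slides each pixel column vertically so that the baseline becomes straight.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x^2 + b*x + c (needs 3+ distinct x values)."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^0..y*x^2
    m = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]

    def det3(q):
        return (q[0][0] * (q[1][1] * q[2][2] - q[1][2] * q[2][1])
                - q[0][1] * (q[1][0] * q[2][2] - q[1][2] * q[2][0])
                + q[0][2] * (q[1][0] * q[2][1] - q[1][1] * q[2][0]))

    def with_column(col):
        # Cramer's rule: swap one column of the matrix for the right-hand side.
        return [[rhs[r] if c == col else m[r][c] for c in range(3)] for r in range(3)]

    d = det3(m)
    return tuple(det3(with_column(col)) / d for col in range(3))


def flatten_columns(image, coeffs, target_row):
    """Shift every column so the fitted baseline lands on target_row."""
    a, b, c = coeffs
    height, width = len(image), len(image[0])
    flat = [[0] * width for _ in range(height)]
    for x in range(width):
        shift = round(a * x * x + b * x + c) - target_row
        for y in range(height):
            src = y + shift
            if 0 <= src < height:
                flat[y][x] = image[src][x]
    return flat


width, height = 7, 12
baseline = [(x, (x - 3) ** 2 + 1) for x in range(width)]   # synthetic curved text line
image = [[0] * width for _ in range(height)]
for x, y in baseline:
    image[y][x] = 1                                        # 1 marks an ink pixel

coeffs = fit_quadratic(baseline)
flat = flatten_columns(image, coeffs, target_row=1)
print(flat[1])   # the curved line now sits on a single row
```

A vertical column shift like this ignores the horizontal squeeze near the gutter, and it has nothing to fit on pages without text lines, which is exactly the limitation described above.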

So we need another way to find the shape of the page, so we can flatten it into a beautiful ebook. That's where my post comes in. I'm pleased to say that tomorrow I will be working on this design which uses lazz0rz to make lines on the page, and also the alternating-lighting design. If these ideas work out, and I have every reason to think they will, we will be DIY book scanning with ~$30 in parts as fast as you can turn pages. Which is pretty fast, especially if you're not doing any lifting or shifting of scanner parts like the Ion. Although honestly, it's pretty fast already. And that video is 2 years old - older than the whole project.

I really appreciate the support from MetaFilter. For those of you who aren't familiar with the good things that have come out of my community, who are not a bunch of awful book thieves, but rather a bunch of amazing people sharing knowledge and solutions and doing good in their communities, the article carsonb linked is a recent good one, and also this short talk I gave at Google talks about some of the amazing, non-infringing things people do with DIY book scanners.
posted by fake at 7:17 AM on January 14, 2011 [5 favorites]


Has OCR gotten a lot more accurate since '95?

As others have mentioned yes, but more importantly, drive space has gotten cheaper such that aside from indexing, it's really irrelevant to bother with OCR. A PDF of page images is dead simple to make and even a 300 page book ends up under 30 MB. Hugely 'wasteful' compared to a .txt file, but you gain back some of the tactile bookiness that comes from seeing visual reminders of prior readers on screen.

If I wanted an ebook, I'd have an ebook. I don't want ebooks. I want, you know, books.

I've been building an Amazon wishlist of books I was waiting to come out on legit e-form, titled "Want in Kindle Form", separate from my regular wishlist. This year for Christmas my mom went through and bought me a physical book of most everything on that list.
posted by nomisxid at 8:19 AM on January 14, 2011


So we need another way to find the shape of the page, so we can flatten it into a beautiful ebook.

I can only imagine that there's some reason why this isn't tried, but why don't people use the page edges as reference lines? My first guess is that A) including page edges adds complexity to the later steps, as you'd have to ditch them for OCR, and B) on a book of any thickness that is scanned while bound, the edge is not "clean", as it were. Although the top/bottom edges might be OK. Anyway, it seems like you could slip a contrasting card just inside the cover or something.

I mean, I suppose not all page edges are parallel/orthogonal to the text, but it would seem like the exceptions are of the same magnitude of the exceptions to the assumption that all lines of text are straight.
posted by RikiTikiTavi at 8:35 AM on January 14, 2011


Has OCR gotten a lot more accurate since '95?

Yes. Well, mostly. But.

I've been working with OCR software for a while now, first in assistive technology (making textbooks accessible to students, etc) and then in academic digitisation projects. Just finished one big project that involved OCRing a huge amount of printed material since around 1700, with two stages of human proofreading after the OCR (even then, we reckon we achieved about a 98% word accuracy rate). OCR has improved, and it can do a lot more than it could in '95. But it's still a long way from being a magic tool where you feed in images at one end and get flawless editable type out of the other.

OCR is pretty good, when:

1) the book in question is new-ish: fairly standard font, no ligatures, no long-s characters. Better OCR software is trainable, so you can tell it "that group of pixels you can't make out is 'ct'" a few times and it'll learn, but even this will only work to a point; a long s tends to look too much like a lower-case f for the software to distinguish between them.

2) the book in question is in good condition: the more yellowed the pages and faded the ink, the more of a problem OCR software will have with it, and the same for marks, black mould, etc. on the pages. (It'll try to read anything that it thinks might possibly be a letter, even if that's the image of the gutter shadow or an illustration or a squashed spider in the margin.) Usually the OCR program itself will try to up the contrast between paper and type, or let you fiddle with that manually, but I've found much better results using programs like Imagemagick and Unpaper to manipulate the images beforehand.

3) the book in question is all written in something the OCR software understands. Quite often OCR software comes with an inbuilt dictionary, so rather than just recognising words on a letter-by-letter basis, it can take a guess at what a word might be even if it's having some trouble with one or two of the letters. This alternates between being incredibly useful and incredibly frustrating; the last project I worked on was a linguistic project looking at Scots words, many of which the OCR software helpfully corrected to standard English if given half a chance. The word 'such' with a long s - 'ſuch' - became 'fuck' on a regular basis, and the amusement value wore off fairly quickly, I can tell you.

4) the book in question can lie flat on a scanner (or to be photographed), although methods of dealing with warp and curvature are improving.
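The long-s confusion in point 1 and the dictionary trouble in point 3 can be illustrated with a small post-OCR pass in Python. This is a hedged sketch, not how any particular OCR package works: the helper name and the toy word list are invented, and the idea is only to list plausible readings for a human proofreader rather than to auto-correct.

```python
from itertools import product

def long_s_candidates(token, lexicon):
    """Return every lexicon word obtainable by reading some 'f' as a long s."""
    # Each 'f' may really be 'f' or a misread long s; other letters are fixed.
    options = [("f", "s") if ch == "f" else (ch,) for ch in token]
    variants = {"".join(choice) for choice in product(*options)}
    return sorted(variants & set(lexicon))

lexicon = {"such", "passage", "fish", "fat", "sat"}   # toy word list (assumption)

print(long_s_candidates("fuch", lexicon))     # ['such']
print(long_s_candidates("paffage", lexicon))  # ['passage']
print(long_s_candidates("fat", lexicon))      # ['fat', 'sat'] -> ambiguous, needs a human
```

The last case shows why a proofreading stage stays necessary: when both readings are real words, only context can decide.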

So yes, it's good, but it's a long, long way off being at a point where I'd trust it to turn all the books I own into e-texts without any human proofreading stage.
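The image clean-up mentioned in point 2 can be sketched in plain Python. To be clear, this does not reproduce what ImageMagick or Unpaper do internally; it is a generic contrast stretch followed by a fixed threshold, with made-up pixel values standing in for faded ink on yellowed paper.

```python
def stretch_contrast(gray):
    """Linearly rescale pixels so the darkest becomes 0 and the lightest 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:               # flat image: nothing to stretch
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]

def binarize(gray, threshold=128):
    """Map each pixel to 0 (ink) or 255 (paper)."""
    return [[0 if p < threshold else 255 for p in row] for row in gray]

# Faded ink (around 120) on yellowed paper (around 180): low contrast.
page = [
    [180, 178, 121, 179],
    [181, 120, 122, 180],
]
clean = binarize(stretch_contrast(page))
print(clean)
```

A single global threshold like this would struggle with gutter shadow or uneven lighting, which is one reason dedicated tools do more than this.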
posted by Catseye at 8:48 AM on January 14, 2011 [4 favorites]


One more question that I'm sure has been explored: is the depth information potentially gained from a *second* camera per page of any help? A quick search suggests that constructing a depth map from two images is difficult, but perhaps the constraints introduced by the book scanning rig can make it easier. (And maybe the constraints make it harder, since it looks like a bit of motion can help the mapping process).

I'm figuring that we'll just be flipping pages in front of our phones, eventually.
posted by RikiTikiTavi at 8:48 AM on January 14, 2011


You know, this is just one more of those things that make me feel all warm and fuzzy inside.

I'm 32, and spent all of my teenage years and most of my adult life as a technophobe. I remember reading about the Kindle 1 in Time, and being totally put off.

But before I even put the magazine down, I thought about every flight with my perpetually overweight carry on bag, always overstuffed with books. And about our 500 mile move in 2005- how my husband and I mercilessly culled our 2000 books down to a mere 600, and yet we still had 27 big plastic storage bins of books to move, some of which weighed nearly 100#. A friend of a friend who got roped into helping us move, after his tenth trip down from our 3rd floor walkup, looked at us and said, "I'd like to tell you about a little invention called TV."

Okay, I thought, maybe. But I can buy books at the thrift store for a dollar.

Then I thought of the bookshelves I coveted, that were made of solid cherry, held around 400 books, and cost $1000. A Kindle holds more books than that, I thought, and is cheaper. And I could carry my library in a backpack. But it was new, and pricey, and I just sort of forgot about it.

Then a year or so later, I got my hands on my techie uncle's Kindle 2. I saw how it was NOT a glowing rectangle like I expected, and how much content was available for free or 99c. I was hooked.

I love books. I love the way they smell, the way they feel in your hands, their heft, the way thick, dry paper feels, the way shiny smooth paper feels. I love illuminated script, tiny margin pictures, chapter heading illustrations. I love the look of a shelf full of books. But mostly what I love is what's in them. The corporeal aspect just is not as important to me as I thought it was.

I feel that this revolution is as important as the Gutenberg revolution. I'm sure there were purists who felt a type set, mass produced book was a soulless thing, with none of the virtues of a book lovingly copied and bound by hand, labored over by a spiritual man who took his duty as a burden and gift from God. And you know what? Those purists were wrong. The Gutenberg revolution heralded an explosion of literacy that changed the world. Among first and second world countries, even the very poor are mostly literate now. Literacy means information, information means education, education means greater freedom and a better chance at a happy, productive life.

(It's funny, that's what made me a liberal, was deciding that purists were usually wrong. I was 8, doing a report on Abraham Lincoln, and decided that I wanted to get on the right side of history and stay there. And 22 years later, I still am happy with the logic of my eight year old self.)

I hope that this technological revolution heralds the last wave of literacy that includes nearly everyone. I hope the future holds a phone sized device with a projector that becomes a virtual keyboard, and another projector that becomes a virtual screen. I hope the phone sized device's real screen is something like that very promising Mirasol stuff that has battery life like eInk, and no glare, for reading. I hope there is worldwide, very inexpensive cell signal internet, to bank the unbanked, to provide health records and information, and hopefully to send so much information over great distances that we rely less on sending people and things. And I hope all these things are built by robots for $8 each so I can buy 100 of them and send them to refugee camps.

It's so crazy, how cheap this stuff has gotten just since I was a teenager. My cousin lives in a cabin with no running water, but they have a tethered smartphone, so she can work there. Amazing. If that's what the future is, houses with composting toilets and wireless broadband cohabiting, sign me up.

I know that people are afraid of what this is going to do to the livelihoods of those who work in publishing, and the massive roster of writers that the publishing industry enables to live a middle class life. I get that. I was a journalist at a humble little small town newspaper in my very early twenties, at the tail end of the pre-blog era. I saw what was happening, and it didn't make me feel good. It made me abandon journalism as my degree minor. But I think that this is one of those things where the good outweighs the bad. Our economy that is based on so many professional middlemen cannot stand in a global economy. Especially in 50 years, when the Earth's population begins its inevitable decline, these capitalist systems just won't survive. We have to come up with something new, even if that new system isn't predicated on MP3 players and ereaders.

I hope that the publishing industry learns from the music industry and the telcos what NOT to do, but I fear that they won't. I was reading an interview with Lady Gaga (what is the name of this new law of the internet, that all discussions, if allowed to go on long enough, turn to Lady Gaga?), and she said point blank that people are welcome to download her music for free, that she makes money, buckets of it, by touring. She broke out and became wildly successful and very wealthy in the post-CD era, so clearly, it can be done. In a way, we have come full circle- now we are back to the traveling troubadour as far as models for musicians to make money. Perhaps it will become the same for writers- maybe many of them will become features on the lecture circuit, or paid storytellers listed alongside stand up comics.

TLDR: more ebooks = win!
posted by Leta at 8:49 AM on January 14, 2011 [5 favorites]


I can only imagine that there's some reason why this isn't tried, but why don't people use the page edges as reference lines?

Yep, that was the Snapter approach. Problem is that page edges (vs., say, picture edges) are not always easy to distinguish from the background.

is the depth information potentially gained from a *second* camera per page of any help?

That was the Decapod proposal (but it is not in their current implementation). I think stereo correspondence is a good approach, but the depth resolution is not spectacular so it usually needs to be combined with other information.
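
To put a rough number on "not spectacular": for two parallel cameras, depth is Z = f * B / d (focal length in pixels, baseline, disparity), so a one-pixel disparity error shifts the depth estimate by about Z^2 / (f * B). A minimal sketch of that arithmetic follows; the rig dimensions are made-up assumptions for illustration, not measurements from Decapod or any real scanner.

```python
def depth_resolution(z_m, focal_px, baseline_m, disparity_step_px=1.0):
    """Approximate smallest depth change (in metres) a stereo pair can resolve.

    Derived from Z = f * B / d, so |dZ| is roughly Z**2 / (f * B) * |dd|.
    """
    return (z_m ** 2) / (focal_px * baseline_m) * disparity_step_px


# Hypothetical rig: cameras 0.5 m above the page, 10 cm apart,
# focal length equivalent to 3000 pixels.
dz = depth_resolution(z_m=0.5, focal_px=3000.0, baseline_m=0.10)
print(f"{dz * 1000:.2f} mm of depth per pixel of disparity")
```

With those assumed numbers the step is a bit under a millimetre per pixel of disparity, and it grows with the square of the camera distance, which is one reason to combine stereo with other cues such as text lines or page edges.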

Good thinking.
posted by fake at 9:01 AM on January 14, 2011


The copyright issues were mostly worked out a decade ago in the Diamond Rio lawsuit, weren't they? Format shifting for personal use is okay; so is selling equipment that helps someone else format shift. The DMCA complicates that by disallowing any such equipment that circumvents copy protection measures, but I don't think "OCR is hard" counts as copy protection.

There was another court case (I forget the defendant company name) where a certain method of helping someone else format shift (customer sends you a checksum of their CD, you send them MP3 versions of each track) was disallowed. That might be a problem for our hypothetical WoW-playing scanner supervisor, but the scanner manufacturer should still be in the clear.
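
For anyone unsure what "sends you a checksum" means in practice, here is a minimal sketch of that kind of lookup. Everything in it is an assumption for illustration (the fingerprint built from track lengths, the `CATALOGUE` dictionary, the file names); it is not how the service in that case actually worked.

```python
import hashlib


def disc_checksum(track_lengths_s):
    """Fingerprint a CD from its track lengths in seconds (hypothetical scheme)."""
    joined = ",".join(str(n) for n in track_lengths_s)
    return hashlib.sha1(joined.encode("ascii")).hexdigest()


# Hypothetical server-side catalogue: disc fingerprint -> MP3 file names.
CATALOGUE = {
    disc_checksum([215, 187, 242]): ["track01.mp3", "track02.mp3", "track03.mp3"],
}


def request_tracks(checksum):
    """Return the MP3s for a recognised disc, or an empty list otherwise."""
    return CATALOGUE.get(checksum, [])


# The customer computes the checksum locally and sends only that.
print(request_tracks(disc_checksum([215, 187, 242])))
```

The point of the design is that the customer never uploads any audio: the server hands over copies it already holds, which is the part that drew the legal objection.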
posted by roystgnr at 9:19 AM on January 14, 2011


I'm of the mindset that the digital copy of a book should be free with the purchase of the physical, but that's just crazy talk.

It's not crazy talk, it's just throwing away free money if you're a publisher.

In principle, I agree, though. My life would be much easier and cheaper. Want.
posted by zeek321 at 9:25 AM on January 14, 2011


There was another court case (I forget the defendant company name) where a certain method of helping someone else format shift (customer sends you a checksum of their CD, you send them MP3 versions of each track) was disallowed.

In 2000 MP3.com lost a lawsuit brought by the RIAA for its my.mp3.com service which did exactly that. http://www.wired.com/techbiz/media/news/2000/04/35933
posted by GuyZero at 9:43 AM on January 14, 2011 [1 favorite]


So yes, it's good, but it's a long, long way off being at a point where I'd trust it to turn all the books I own into e-texts without any human proofreading stage.

Interestingly, I learned from reading Stephen R. Donaldson's excellent website that a lot of older books are converted to ebooks via scanning and OCR, rather than by using any kind of electronic file which the publishing company may have. This process isn't always correctly proofread, and he actually had to take at least one of his publishers to task for putting out such a shoddy product and then selling it as though it were ready for the public eye. He actually had to have his readers submit a legion of errors in the ebook versions to him so he could provide proof to Bantam/Spectra about how bad these actually were, because the publisher wasn't willing to just take his word for it.
posted by hippybear at 10:56 AM on January 14, 2011


Yeah, I've heard anecdotally that a lot of Amazon ebooks are from scanned copies of Real Physical Books, which makes little sense to me when they are modern books authored on computers. Good to see that substantiated by an author.

In 2000 MP3.com lost a lawsuit brought by the RIAA for its my.mp3.com service which did exactly that. http://www.wired.com/techbiz/media/news/2000/04/35933


I was on a panel with Michael Robertson at WFUD last year and he talked about his heartbreaking experiences with that. He is a seriously interesting and wickedly intelligent guy who is not afraid to throw money and extreme effort at a problem. The "digital locker" issue was a big part of it. Wikipedia doesn't say much about it. More on that here and I'm still trying to find the video of his talk, which was excellent.
posted by fake at 11:17 AM on January 14, 2011


This thread has been archived and is closed to new comments