The distributors of e-books and the manufacturers of e-book readers wax lyrical about the foolproof digital rights software that they incorporate to prevent users from copying book files to one another. But the great strength of the electronic ink that e-books use - namely that it works by reflected light, making reading comfortable - is also a great weakness. You can put an e-book in an ordinary scanner and take the text straight off the screen.
Above is a scan at 600 dots-per-inch from an e-book. I ran it through gocr to extract the text:
Sherlock Holmes swallowed a cup of coffee, and turned hi_ attention to the ham and eggs. Then he rose_ lit his pipe, and settled himself down înto his chair. I'll tell you what I did first_ and how I came to do it afterwards_ '' said he. ''After leaving you at the st8_on I went for a charming walk through some admir8ble Surrey scene_ to a pret_ little village called Ripley_ where I had my tea at an inn, and took the precaution of filling my nask and of pum'ng a paper of sandwiches in my pocket. There I remained until evening, when I set off for Woking ag8in_ and found myself in the high-road outside B_8rbrae just after sunset. Well_ I waited until the road was clear-it is never a ve_ frequented one at any time, I fancy-and then I clambered over the fence into _e grounds. '' Surely the gate was open.! '' ejaculated Phelps.
That was with the program running on its defaults, neither tuned to nor trained on the e-book's font. Nor was the output run through a spell-checker. If one did all that, the result would be even better. Put a couple of wires into the e-book's next-page button and solder a MOSFET across them, driven by one bit of the scanning computer's parallel port, and you could scan an entire book to text completely automatically...
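To give a rough idea of what the spell-checking pass might look like, here is a minimal Python sketch. It is not what was run for the scan above: the tiny word list, the function names and the 0.75 similarity cutoff are all assumptions made for illustration, and a real pass would load a full dictionary. It only touches tokens containing an underscore or a digit, which is how most of the OCR damage in the sample shows up.

```python
import difflib
import re

# Hypothetical mini word list for illustration only;
# a real pass would load a full dictionary instead.
WORDS = ["admirable", "again", "attention", "station", "pretty", "very"]

# Characters that rarely belong inside an English word but are
# common in the raw OCR output (underscores and stray digits).
SUSPICIOUS = re.compile(r"[_0-9]")


def correct_token(token, words=WORDS, cutoff=0.75):
    """Return the closest dictionary word for a suspicious token.

    Tokens without suspicious characters, and tokens with no
    sufficiently close match, are returned unchanged. A matched
    word comes back in lower case, so capitalisation is lost.
    """
    if not SUSPICIOUS.search(token):
        return token
    matches = difflib.get_close_matches(token.lower(), words, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(text):
    """Apply correct_token to every whitespace-separated token."""
    return " ".join(correct_token(t) for t in text.split())


print(correct_text("some admir8ble Surrey scenery, then Woking ag8in"))
```

Lightly damaged words such as admir8ble and ag8in are close enough to a dictionary entry to be repaired, whereas a badly mangled token like st8_on falls below the cutoff and is left alone. That is the safer failure mode, and it is where teaching the OCR program the font would help more than any spell-checker.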
When the software hackers haven't found a way around the security yet, there's always the hardware way. :)
This post doesn't have enough 'like' buttons to mash.
The analog hole strikes again.