Sometimes a new feature comes along that leaves you unsure whether it’s a good or a bad development. This is one of them. A company by the name of Remote Approach has developed a system whereby PDF files can be tagged with some added code, so that each file reports home every time someone opens it. The report sends the reader’s IP address and other details, including any unique identifiers the makers choose to add, back to the author. They are also apparently working on a method of denying access to the PDF if the reader is not online at the time.
My concern is that this could become yet another tool for tracking users’ habits, and that companies will start using it to require users to be online to read ebooks, so that they can track piracy. Since ebook readers and similar devices are unlikely to be online most of the time, this could create serious usability issues. My laptop has been very handy for reading long PDFs while sitting on a deck chair out in the back, out of range of the wireless network. I’d hate to lose that ability to a restrictive new tool. I’m also not convinced that such a tool should be allowed to collect IP addresses and other such explicit information about users, as it increases the likelihood that the tool might one day be used by unscrupulous types. This particular tool is subscription based, and as such will be under the control of Remote Approach, but there is a good chance that the technology can be co-opted by people of less moral fibre, and that worries me somewhat. Trends seem to strongly indicate that the days of the anonymous Internet are drawing to a close. As John Bielby of Remote Approach points out, such information gathering already takes place with web server logs. What he doesn’t mention is that web users can use an anonymizer service to hide their details from web servers if they choose to do so. No such facility is currently available for the new PDF system.
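For readers who would like to check a PDF before opening it, here is a rough sketch in Python. How Remote Approach's tagging works internally is not known here. The sketch simply assumes that any phone-home code would show up as one of the standard PDF action types (JavaScript, URI, form submission and so on), and it only looks at the raw bytes, so anything tucked inside a compressed stream will be missed. The function names `find_suspect_tokens` and `scan_file` are made up for this example. A hit means only that the file contains a script or link of some kind, which plenty of harmless PDFs do.

```python
import re

# PDF name tokens for actions that can run scripts or contact a server.
# ASSUMPTION: a tracking tag would use one of these standard action types;
# this is not confirmed for Remote Approach's system.
SUSPECT_TOKENS = [
    b"/JavaScript",
    b"/JS",
    b"/URI",
    b"/SubmitForm",
    b"/Launch",
    b"/OpenAction",
    b"/AA",
]


def find_suspect_tokens(data: bytes) -> dict:
    """Count occurrences of each suspect token in raw PDF bytes."""
    counts = {}
    for token in SUSPECT_TOKENS:
        # Reject matches that continue into a longer name, so that
        # "/JS" is not counted inside "/JSomethingElse".
        pattern = re.escape(token) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, data))
        if hits:
            counts[token.decode("ascii")] = hits
    return counts


def scan_file(path: str) -> dict:
    """Read a PDF from disk and report which suspect tokens it contains."""
    with open(path, "rb") as handle:
        return find_suspect_tokens(handle.read())
```

Calling `scan_file("some.pdf")` returns a dictionary mapping each token found to the number of times it appears. An empty dictionary means nothing was found in the uncompressed parts of the file, which is not proof that the file is clean.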
We have just been offered one suggested solution to the offline-use dilemma (along with a $30 donation for mentioning it through December 13, 2007): use a PDF to HTML converter. This would let you work with the document as plain HTML, which would be available offline. It is an intriguing idea. For a long time, Google has converted PDF documents to HTML, and I often use that version when searching to get a sense of what is in a document, because the download is relatively lighter. Unfortunately, I have not been able to test able2extract, because the donation didn’t come with a copy of the software. The offer suggests that, unlike Google, it will convert images as well as text. In my experience, Google mainly converts the text portion.
I’ll admit, I have yet to actually encounter one of these files that isn’t available offline, so perhaps we are thus far tilting at windmills? If it becomes commonplace, then I’ll definitely try something like able2extract.