*Boing!* It looks like
As Tristan Davis, Senior Lead Program Manager for Word, explained: “With this functionality, you can transform your PDFs back into fully editable Word documents, rehydrating headings, bulleted/numbered lists, tables, footnotes, etc. by analyzing the contents of the PDF file.”

These are not trivial tasks, since many PDFs represent these structures with nothing more than the positions of text and its font characteristics. I've been working on and off over the last year on a project to extract data tables from PDF documents, and there are no perfect solutions. Just detecting a table is hard enough, and describing its structure is also very tricky. Even if the page has graphical grid lines, there is plenty of room for idiosyncratic user representations. One of my colleagues is working on detecting section and list numbering, and even that is no picnic.
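To give a feel for why position-only table detection is fiddly, here is a minimal sketch of one common heuristic: group words into rows by baseline, then look for left edges that line up across several rows. This is not how Word does it, and not the project mentioned above. The `(x, y, text)` word tuples, the function names, and the tolerance values are all assumptions made up for illustration.

```python
def group_rows(words, y_tol=2.0):
    """Group (x, y, text) words into rows. A word joins the current row
    when its y is within y_tol of that row's first word (assumed tolerance)."""
    rows = []
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        if rows and abs(rows[-1]["y"] - y) <= y_tol:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    return [sorted(r["cells"]) for r in rows]


def column_anchors(rows, x_tol=3.0, min_rows=3):
    """Cluster left-edge x positions; keep clusters seen in >= min_rows rows.
    Returns the mean x of each surviving cluster, left to right."""
    xs = sorted((x, i) for i, row in enumerate(rows) for x, _ in row)
    clusters = []  # each entry: (list of x values, set of row indices)
    for x, i in xs:
        if clusters and x - clusters[-1][0][-1] <= x_tol:
            clusters[-1][0].append(x)
            clusters[-1][1].add(i)
        else:
            clusters.append(([x], {i}))
    return [sum(c) / len(c) for c, seen in clusters if len(seen) >= min_rows]


# Hypothetical page: three table rows followed by a line of body text.
words = [
    (72.0, 100.0, "Name"), (200.0, 100.0, "Qty"), (300.0, 100.0, "Price"),
    (72.0, 114.0, "Apple"), (200.5, 114.0, "3"), (300.0, 114.0, "1.20"),
    (72.0, 128.0, "Pear"), (199.8, 128.0, "10"), (300.0, 128.0, "0.80"),
    (72.0, 160.0, "A paragraph of plain text."),
]
rows = group_rows(words)
anchors = column_anchors(rows)
print(len(rows), "rows;", len(anchors), "candidate column edges")
```

Even this toy shows the weak spots: the body-text line shares the left margin with the first column, right-aligned or centered columns have no common left edge at all, and the tolerances have to be tuned per document.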
posted by Egg Shen at 11:22 AM on August 15, 2012 [1 favorite]