Writing/modifying pdfs #56
Comments
Both are definitely possible. The first fits within the purview of PDFIO, while the second can be developed as a separate project that uses the capabilities of PDFIO. PDFIO is a low-level PDF reading API (it can be extended for manipulation). There are no plans to move it into the realm of machine learning, NLP, or document structure understanding.
I agree. What should be done to support writing PDF files? Is that a large undertaking?
For the list given, 3-6 man-months, depending on how well you understand the PDF specification. Many of the items need document understanding, which can be excluded from the list. More than development, good PDF parsers have to be tested with a variety of file types. That can be overwhelming.
Unfortunately, I'm not very familiar with the PDF spec. What is the bare minimum that needs to be implemented just to write PDFs?
@kskyten Unfortunately, without understanding the PDF specification it will be hard to write a writer, particularly when you are looking at modifying page content. Moreover, writers require compression encoders, which are not integrated into PDFIO; only decoders are currently integrated. Personally, a writer is not very high on my priorities. While I can guide as the maintainer and owner of the library, I cannot commit to any implementation work myself.
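To make the encoder point concrete, here is a rough sketch of what a writer has to do for the most common filter, FlateDecode, which is zlib/deflate compression. It is written in Python purely for illustration and is not PDFIO's API; the helper name `flate_stream` is made up for this example.

```python
import zlib

def flate_stream(raw: bytes) -> bytes:
    """Compress raw stream bytes and wrap them as a PDF stream body.

    Hypothetical helper: the /Length entry must be the size of the
    *compressed* data, not of the original bytes.
    """
    compressed = zlib.compress(raw)
    header = b"<< /Filter /FlateDecode /Length %d >>" % len(compressed)
    return header + b"\nstream\n" + compressed + b"\nendstream"

# Example: some repetitive page-content operators (draw a line 50 times).
raw = b"0 0 m 100 100 l S\n" * 50
obj = flate_stream(raw)
```

Streams that are copied over unmodified can keep their existing filter and bytes, so an encoder like this would only be needed for streams whose content is actually changed (or those could be written uncompressed, with no /Filter entry).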
I was hoping I would just be able to copy the unmodified streams over and modify the lengths and references to make it work. I don't think I need a full-blown writer, as I only need to modify a specific subset of streams, but I might be wrong.
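A minimal sketch of that idea, assuming a classic (uncompressed) cross-reference table and no incremental updates: each object body is written out in order, the /Length of any changed stream is recomputed from its bytes, and the byte offsets are recorded so the xref table and `startxref` pointer can be regenerated. This is Python for illustration only, not PDFIO code; `stream_object` and `build_pdf` are hypothetical names.

```python
def stream_object(dict_entries: bytes, data: bytes) -> bytes:
    """Wrap stream bytes, recomputing /Length from the actual data."""
    return (b"<< " + dict_entries + b" /Length %d >>" % len(data)
            + b"\nstream\n" + data + b"\nendstream")

def build_pdf(bodies: list, root_obj: int = 1) -> bytes:
    """Serialize object bodies (numbered 1..n) with a fresh xref table."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where "num 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"  # entry for the free object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root %d 0 R >>\n" % (len(bodies) + 1, root_obj)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

# Example: a one-page document whose content stream draws a single line.
content = b"0 0 m 100 100 l S"
bodies = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
    b"/Resources << >> /Contents 4 0 R >>",
    stream_object(b"", content),
]
pdf = build_pdf(bodies)
```

Keeping the original object numbers means the indirect references (`4 0 R` and so on) stay valid, so only the offsets and the lengths of modified streams need to change. Files that use cross-reference streams or object streams would need more work than this sketch covers.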
Is it possible to modify the parsed PDF and write it to a file? Specifically, I'm interested in the ideas from here: open-source-ideas/ideas#46. Julia has excellent support for neural networks, so it would be interesting to experiment with something like this.