split by bookmarks file name convention

Help requests about pdfsam


Postby nestoru » Wed Jul 28, 2010 3:10 pm

First of all thanks for this great tool.

I would like to understand if it is possible to change the default file name that gets generated when the document is split by bookmarks.

I have a requirement to split on first-level bookmarks and then, depending on the text of each bookmark, merge certain resulting documents together while still keeping the original bookmarks.

I noticed there is a known bug related to bookmarks that is still being worked on (viewtopic.php?f=1&t=799&p=3743&hilit=bookmarks#p3743): basically, the bookmarks do show up when the files are merged back, but they do not link to any specific page, so clicking them takes you nowhere.

But besides that bug, I want to understand whether I can get the split documents named after the bookmark (like http://sourceforge.net/projects/splitpdf/ does). Of course, that is mandatory if I want to be able to regroup them.

Any help will be greatly appreciated. Thanks!

-Nestor Urquiza
nestoru
Novice
 
Posts: 7
Joined: Wed Jul 28, 2010 2:59 pm

Re: split by bookmarks file name convention

Postby Andrea » Wed Jul 28, 2010 4:32 pm

No, at the moment pdfsam can only split by level, and there's no way to match particular bookmarks at that level. As for bookmark handling when merging, I'm working on it and it's not yet 100% fixed.
It's a nice feature anyway, and I can try to add it in the new release. I checked the project you linked to see how it's implemented there but, even though it's on SourceForge, I can't find the source code.
Andrea
----------------
Andrea
Site Admin
 
Posts: 753
Joined: Tue Oct 31, 2006 4:11 pm
Location: Amsterdam

Re: split by bookmarks file name convention

Postby nestoru » Sat Aug 14, 2010 4:23 pm

Hi Andrea,

Thanks a lot for replying back.

I have posted the question on SourceForge to see if we can get the source code: https://sourceforge.net/projects/splitp ... dex/page/1

On the other hand, is there a way to monitor this thread? I did not receive a notification of your post; I had to revisit the link to find out whether anyone had replied.

Cheers,

-Nestor

Re: split by bookmarks file name convention

Postby Andrea » Sun Aug 15, 2010 8:55 am

At the bottom of the page there's a link saying "subscribe topic".
Andrea

Re: split by bookmarks file name convention

Postby nestoru » Mon Aug 16, 2010 1:28 pm

Here are my findings so far. The splitPdf author does not reply on the forum and the source code is not there. Oddly, the license inside the jar file states that it is distributed under the Apache License.

In any case, examining the jar, I found he used an iText-2.1.7 class that for some reason was later removed from the package but is still available at http://www.docjar.com/html/api/com/lowa ... .java.html

Let me know if it helps and thanks for taking a look at this.

-Nestor

Re: split by bookmarks file name convention

Postby nestoru » Fri Aug 20, 2010 6:18 pm

I am taking a look now at jpdfbookmarks. Check this out http://flavianopetrocchi.blogspot.com/2 ... ource.html

Looks pretty close to what you need for the split-by-bookmark feature I think, right Andrea?

Re: split by bookmarks file name convention

Postby Andrea » Sat Aug 21, 2010 11:31 am

Well, I think it's not a big deal to add some sort of "bookmark name pattern match" when splitting by bookmarks. The second piece of software you pointed to is a sort of bookmarks editor and looks a lot more complicated than the original request. If I understood correctly, you want to be able to specify a bookmark level AND a name pattern, splitting at that level if the bookmark matches the pattern. I think this can be done without much effort, and I'll try to add it to the next release.
BTW, I realised that I didn't answer one of your questions. Here is how you can use some variables to generate the output file name:

http://www.pdfsam.org/mediawiki/index.p ... xConfigure
Andrea

Re: split by bookmarks file name convention

Postby nestoru » Sat Aug 21, 2010 4:26 pm

That is great news, and you are correct. Let me spell out the tasks I need to perform:

1. Split a pdf file by a bookmark level.
2. Name each chunk after its bookmark and include in each output chunk the original bookmarks that apply. For example: I choose to split by level 2 bookmarks and the original pdf has the following bookmarks: General/Ants/habitat, General/Ants/types, General/Spiders/habitat, General/Spiders/types. I should end up with two files named Ants.pdf and Spiders.pdf. Inside each file I should have two bookmarks (habitat and types).
3. I then decide to merge Ants.pdf with Spiders.pdf. The resulting file should have 6 bookmarks: "Ants" and "Spiders" (level 1), coming from the names of the files, and under each of them (as child bookmarks) "habitat" and "types" (level 2).
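
To make the three tasks concrete, here is a minimal Python sketch that models only the bookkeeping: which output files a split would produce and which outline a later merge would be expected to rebuild. It does not read or write any PDF and is not how pdfsam is implemented. The function names plan_split and merged_outline, and the use of "/" to separate bookmark levels, are assumptions made for illustration.

```python
def plan_split(bookmark_paths, level):
    """Group bookmark paths by the title found at `level` (1-based).

    Returns a dict mapping each output file name to the list of bookmark
    paths that sit below the split level. Insertion order is preserved.
    """
    plan = {}
    for path in bookmark_paths:
        parts = path.split("/")
        if len(parts) < level:
            continue  # this bookmark is shallower than the split level
        file_name = parts[level - 1] + ".pdf"
        children = plan.setdefault(file_name, [])
        # Anything below the split level stays as a bookmark inside the chunk.
        if len(parts) > level:
            children.append("/".join(parts[level:]))
    return plan


def merged_outline(plan):
    """List the bookmarks expected after merging the chunks back together.

    Each file name becomes a level 1 bookmark and its children follow it.
    """
    outline = []
    for file_name, children in plan.items():
        parent = file_name[: -len(".pdf")]
        outline.append(parent)
        outline.extend(parent + "/" + child for child in children)
    return outline


paths = [
    "General/Ants/habitat",
    "General/Ants/types",
    "General/Spiders/habitat",
    "General/Spiders/types",
]
plan = plan_split(paths, level=2)
outline = merged_outline(plan)
print(plan)
print(outline)
```

With the sample bookmarks above, the plan holds two files (Ants.pdf and Spiders.pdf) with two child bookmarks each, and the merged outline holds six entries, which matches the counts given in points 2 and 3.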

Is that still what you are planning to include in the next release?

Thanks!
-Nestor

Re: split by bookmarks file name convention

Postby nestoru » Sun Aug 22, 2010 1:07 am

Regarding the file name, you are correct: with the "-p [BOOKMARK_NAME]" option I got the names I expected:

Code:
./run-console.sh -f /Users/nestor/Downloads/pdf/20100722_dailystm.pdf -o /Users/nestor/Downloads/pdf/pdfsam_out -s BLEVEL -bl 1 -p [BOOKMARK_NAME] split


However, the created PDF files do not keep the child bookmarks.

Thanks!
-Nestor

Re: split by bookmarks file name convention

Postby Andrea » Sun Aug 22, 2010 9:52 am

I was thinking about points 1 and 2 plus a pattern matching function. Points 1 and 2 should already be there (not fully working due to the bookmarks handling bug), and I was thinking about adding a pattern matching feature like this:
I have 3 bookmarks called "ants", "pants" and "XXX". If I decide to split at bookmark level 1 and set a pattern like "*ants", it splits at the first and second bookmarks but not at the third.
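
As a rough illustration of the matching rule described above, here is a small Python sketch. It uses Python's standard fnmatch wildcards as a stand-in for whatever pattern syntax pdfsam might eventually adopt, which is an assumption; the function name split_points is hypothetical and is not part of pdfsam.

```python
from fnmatch import fnmatchcase


def split_points(bookmark_titles, pattern):
    """Return the bookmark titles at which a split would occur.

    A title qualifies when it matches the shell-style wildcard `pattern`
    ("*" matches any run of characters). Matching is case-sensitive.
    """
    return [title for title in bookmark_titles if fnmatchcase(title, pattern)]


titles = ["ants", "pants", "XXX"]
matches = split_points(titles, "*ants")
print(matches)
```

Under these assumptions the pattern "*ants" selects "ants" and "pants" and skips "XXX", as in the example above.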
Andrea

Re: split by bookmarks file name convention

Postby nestoru » Wed Sep 01, 2010 6:55 pm

I did not get an email notification when you posted. I will need to check this thread on my own every other day so that I do not miss your feedback.

So you are saying that after the next release then I will be able to fulfill my use case. Right?

On the other hand, I wouldn't extend pdfsam to accommodate a possible plethora of pre-processing actions that can occur between the splitting and the merging. Those concerns can easily be handled with any of the powerful Unix tools, programming languages, scripting and so on.

For the scenario you are mentioning it is OK to split "ants" "pants" and "XXX" then later join just "ants" and "pants" from the external program/script which uses pdfsam.

On a side note, the developer of splitpdf did put his code on SF. However, as I said, he depends on an old version of iText, and the code that actually handles the splitting and merging lives in that version. For some reason that functionality was later removed. As you said, though, we are fine with just pdfsam for this use case.

Do you have an estimate for the next release? If you commit the necessary fixes to the trunk, I would love to try them on my sample above (split by bookmarks, then join later while respecting the bookmarks).

many thanks!

-Nestor

Re: split by bookmarks file name convention

Postby Andrea » Fri Sep 03, 2010 7:54 am

No sorry, I don't know when I'll have something to release (I had a lot to do at work lately). I'll send you a message when I've something that can be tested.
Andrea

