How to fix verse numbers after PDF-to-DOCX conversion
Hi, I'm trying to create my Personal Book from a PDF.
The image below shows my PDF. As you can see, it has both the chapter and verse numbers in it.
I converted it from PDF to DocX using Adobe Acrobat Reader PRO 2021 and got good output.
My only problem is that the output concatenates the chapter/verse number with the first word of the line.
So, for the example above, the output text is:
1.1Irmãos, pelas desgraças....
I would like to know if there is some way to keep the chapter/verse numbers but avoid the concatenation in the output text.
Could you help me?
Thank you,
Hi Nycholas:
Do you have Microsoft Word 2013 or later available to you? It's possible it might provide a better outcome converting than Adobe. If your PDF is not under copyright, you could post it here and I could run it through Word and see if it works well.
macOS (Logos Pro - Beta) | Android 13 (Logos Stable)
Thank you Robert!
Yes, I have Microsoft Office 2019. So I converted my PDF using Microsoft Word and I got a better output!
Now I have the correct blank spaces between the chapter/verse numbers and the first word of the line. Great!
You can see it in Microsoft Word:
So, the next problem is this: when I imported the new *.docx file into my Personal Book in Logos 9, I saw that there are other verses where the original PDF itself concatenates the verse number with the first word of the line. As you can see below:
So, in this second case, the Word output is a concatenated number/text too.
The problem is that when I try to search for this word in Logos, I need to type the concatenated number too. As you can see here:
So, is there a way to fix these cases where the original PDF verse numbers are concatenated with the first word of the line?
Thank you again!
OK, try this:
(Make a safe copy of the Word docx.) In Word, on the Home tab, click the Find & Replace button (at the right) and then click Replace. In the dialog box that appears, click the 'More' button at the bottom. Click the 'Use wildcards' box. In the 'Find what' box, type this: ([0-9])([A-Z])
In the 'Replace with' box, type \1 \2 (that's a space between 1 and \)
(If your Portuguese character set has different 'first and last' letters, you'll edit that part.)
Click the 'Replace all' button. What this should do is find anywhere there is a number followed by an upper-case letter with no spaces in between and insert one space. If you want two spaces between verse and text, add another space. There are probably some verses that begin with a lower-case letter; to fix those, run the Replace again with [a-z] in place of [A-Z] in 'Find what'.
Examine your docx and see if all went well. The good thing about Find & Replace is that it does exactly what you tell it to do. The bad thing about Find & Replace is that it does exactly what you tell it to do. :)
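For anyone who prefers to check the pattern on plain text before running it in Word, here is a minimal Python sketch of the same idea. It is not part of the Word workflow above: the function name is made up for illustration, the accented-letter ranges are an assumption chosen for Portuguese, and it works on a plain string only (reading or writing a .docx would need an extra library, which is not shown).

```python
import re

# A digit immediately followed by a letter, captured as two groups.
# The accented ranges are an assumption aimed at Portuguese text.
FUSED = re.compile(r"([0-9])([A-Za-zÀ-ÖØ-öø-ÿ])")


def split_verse_numbers(text: str) -> str:
    """Insert one space between a digit and the letter that directly follows it."""
    return FUSED.sub(r"\1 \2", text)


print(split_verse_numbers("1.1Irmãos, pelas desgraças"))
```

Like the Word replace, this does exactly what it is told: any digit followed directly by a letter gets a space, so something like "2nd" would also be split. Text that already has a space after the verse number is left alone.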
Thank you so much for your good help Robert!
In the 'Find what' field, I wrote: ([0-9])([A-zÀ-ú])
because Portuguese has accented letters. It works great!
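One small caution about the ranges A-z and À-ú, sketched here in Python. In Unicode code-point order, A-z also covers a few punctuation characters that sit between Z and a, and À-ú also covers the × and ÷ signs. The check below shows this for Python's regex engine only; whether Word's wildcard search orders ranges the same way is an assumption I have not verified. The "strict" pattern is just one possible tighter alternative, not something from the posts above.

```python
import re

# The range used in the post, read by Unicode code point.
loose = re.compile(r"[A-zÀ-ú]")
# A tighter alternative (an assumption): letters only, skipping × and ÷.
strict = re.compile(r"[A-Za-zÀ-ÖØ-öø-ÿ]")

for ch in ["[", "_", "^", "×", "÷", "ã", "I"]:
    in_loose = loose.fullmatch(ch) is not None
    in_strict = strict.fullmatch(ch) is not None
    print(repr(ch), "loose:", in_loose, "strict:", in_strict)
```

In a Bible text, a digit directly followed by one of those extra characters is probably rare, so the looser range is unlikely to cause harm. It is only worth tightening if stray spaces show up where they should not.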
Also, Nycholas, I should add this, in case you don't know:
You'll be able to read your Personal Book you make from this as you would an ordinary book, and it will be searchable in the full-text index. However, you should also be aware that it won't behave as other bibles you might have from Logos, unless you do some tagging on it. If you are able to add the necessary tagging, you'll be able to use it in link sets with commentaries, Bible Search, Text Comparison, Copy Bible Verses, and a lot of other things I can't think of right now. It's a lot of work, but if this is a translation you value, it might be worth it. I have only tagged a bible personal book once (https://community.logos.com/forums/t/151799.aspx ) and I had to use MS Excel as an intermediate tool. If you need to do this, I'll be able to help.
I saw your bible work! Amazing! Thank you for sharing this MS Excel file.
My goal is to create a Personal Book such that when anyone copies a selection from it and pastes it elsewhere, the pasted text comes with a proper bibliographic citation, like this example:
---------------------
"This is the copied text from the Nyck Personal Book."
Maia, Nycholas. Personal Book Name (2022). Chapter/Verse 10:20-30. Page 150.
---------------------
Is this possible for Personal Books?
Ahhh...One more question:
My original PDF has hyperlinks that point to locations within the same PDF (to the last page). As you can see in the image below:
When MS Office 2019 converted the PDF to *.docx, these links stopped working, even though they are still shown in blue.
Is there a way to keep these hyperlinks working in the *.docx file?
Thank you again!