Merging even and odd pages into a PDF document on Mac OS X

Recently my father and I digitized two large books. My father did the bulk of the work by photographing more than 1200 pages. He first photographed all the odd pages and then all the even pages. As with any repetitive task, errors occurred and he missed a few pages.

All the even pages were in one folder and all the odd pages in another. The goal was of course to merge them into a single PDF document. If it weren’t for the occasional missing pages, this would have been straightforward: just use Apple’s Automator to rename all the files. Automator allows you to give a bunch of files a base name followed by a serial number.

The trick is to serialize the odd pages 1-1200 (e.g. drewes0102) and then the even pages in exactly the same way. This is possible since the even and odd pages are still in separate folders. Next you can use Automator again to add a suffix to the file names.

Give the “a” suffix to the odd pages and the “b” suffix to the even pages. You can then move all the files into one directory. They will be sorted as:

drewes0001a
drewes0001b
drewes0002a
drewes0002b…
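
The same two Automator steps can also be sketched in the shell. This is only a rough equivalent, assuming the scans are .jpg files whose alphabetical order matches the order in which they were photographed. The function name serialize_with_suffix and the folder names odd, even and merged are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy every .jpg in a source folder into a target
# folder as <base><4-digit serial><suffix>.jpg, counting up from 1.
# Assumes alphabetical order of the file names equals shooting order.
serialize_with_suffix() {
    local src="$1" dest="$2" base="$3" suffix="$4"
    local n=1 file
    mkdir -p "$dest"
    for file in "$src"/*.jpg; do
        [ -e "$file" ] || continue   # folder contained no .jpg files
        cp "$file" "$dest/$(printf '%s%04d%s.jpg' "$base" "$n" "$suffix")"
        n=$((n + 1))
    done
}

# Example calls (folder names are assumptions):
# serialize_with_suffix odd  merged drewes a
# serialize_with_suffix even merged drewes b
```

Copying instead of moving keeps the original folders untouched, so the step can be repeated if the order turns out to be wrong.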

The last step is to use either Adobe Acrobat or Automator to merge the individual files into a single PDF document. For the Automator option you first need to create one PDF document for the even pages and one for the odd pages. In a third step, the “Shuffling pages” option allows you to combine these two PDF documents into one.

Since certain pages were missing, this solution was not sufficient. If, for example, page 3 was missing, then the sequence would be:

1,2,5,4,7,6
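
The shift can be reproduced with a small sketch that shuffles two lists the way the merge does, taking one entry from each in turn. The function name shuffle_pages is hypothetical, and the page numbers are just the ones from the example above.

```shell
#!/bin/bash
# Hypothetical illustration: interleave two space-separated page lists,
# taking one entry from the odd list, then one from the even list.
shuffle_pages() {
    local odd="$1" even="$2" out="" e
    set -- $odd              # positional parameters now hold the odd pages
    for e in $even; do
        if [ $# -gt 0 ]; then
            out="$out,$1"
            shift
        fi
        out="$out,$e"
    done
    for e in "$@"; do        # leftover odd pages, if any
        out="$out,$e"
    done
    echo "${out#,}"
}

shuffle_pages "1 5 7" "2 4 6"    # page 3 missing: prints 1,2,5,4,7,6
```

Every odd page after the gap lands one slot too early, which is why the gap has to be filled rather than ignored.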

It would also be great if the book’s page numbers corresponded to the PDF document’s page numbers, meaning that if you go to page 103 in the PDF file, you see page 103 of the book. The solution was to include white dummy pages for the missing pages.
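
Placing the dummy pages can also be sketched in the shell. The sketch below assumes that a white image of the right size already exists (here called blank.jpg, a made-up name) and that the files follow the drewesNNNN.jpg pattern. The function name add_dummy_pages is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a blank white image into place for each missing
# page number. Usage: add_dummy_pages BLANK DEST_DIR PAGE...
# Page numbers must be given without leading zeros (3, not 03).
add_dummy_pages() {
    local blank="$1" dest="$2" page target
    shift 2
    for page in "$@"; do
        target="$dest/$(printf 'drewes%04d.jpg' "$page")"
        if [ -e "$target" ]; then
            echo "skipping page $page: $target already exists" >&2
        else
            cp "$blank" "$target"
        fi
    done
}

# Example call (file and folder names are assumptions):
# add_dummy_pages blank.jpg "good images" 3 57
```

The existence check is there so that a mistyped page number cannot overwrite a real scan.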

The pages after each gap then all need to be re-serialized. This means that you first have to move all the good pages into a dedicated directory, call it “good images”. Add the white dummy pages with the right serial numbers manually. You then rename all the remaining files in the original directory. I decided not to use the a/b suffix solution described above, but to re-serialize the files with an increment of 2. That way I could continue to look at each page scan and ensure that the page number in the scan was the same as the number in its file name. Jürgen Brandstetter was kind enough to help me write a small script to rename the files:

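#!/bin/bash
# Starting page number; set it to 5 if, for example, page 3 was missing.
declare -i i=1
for file in *.jpg; do
    new=$(printf "%04d.jpg" "$i")
    # Move the file into the existing "rename" directory under its new name.
    mv "$file" "rename/drewes$new"
    # Step by 2 so that the numbers stay odd (or even).
    i=$((i + 2))
done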

In this script, i defines the starting number of the renaming. The script searches for all the files that end in .jpg and renames them, starting with i and counting up in steps of 2. If, for example, page 3 was missing, you would set i=5 to rename the pages that follow it. It is also important to note that a directory called “rename” needs to be present in the image folder. The renaming is done by moving the files into this directory.
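
Since the starting number changes after every gap, it can be convenient to pass it in rather than edit the file each time. The following is a sketch of such a variant, not the script we actually used. The function name rename_serial is hypothetical, and it also checks for the “rename” directory before touching any file.

```shell
#!/bin/bash
# Hypothetical variant of the rename script as a function: the starting
# page number is an argument instead of being edited in the file.
rename_serial() {
    local i="${1:-1}" file new
    if [ ! -d rename ]; then
        echo "error: no 'rename' directory in $(pwd)" >&2
        return 1
    fi
    for file in *.jpg; do
        [ -e "$file" ] || continue        # no .jpg files found
        new=$(printf 'drewes%04d.jpg' "$i")
        mv "$file" "rename/$new"
        i=$((i + 2))                      # stay on odd (or even) numbers
    done
}

# Example: the pages before the gap are already in place, so continue at 5.
# rename_serial 5
```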

I created a simple text document and saved it as rename_serial_odd.sh on the desktop. Use the Terminal to make that file executable with:

chmod +x rename_serial_odd.sh

You should then use the Terminal to change to the directory that contains the files you intend to rename and that also includes the rename folder. You can then call the script as:

/Users/yourUserName/Desktop/rename_serial_odd.sh

You need to complete this process for both the even and the odd pages. The advantage of this method is that you can always check the file name against the page number of the book. Once I had added the dummy pages and renamed the files, I moved the even and odd pages into one directory. The last step was to use Acrobat to merge all the files into a single PDF.
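
Before merging, it may be worth checking that the combined directory really has no gaps left. Here is a minimal sketch, assuming the drewesNNNN.jpg naming from the script above. The function name find_missing_pages and the directory name in the example are made up.

```shell
#!/bin/bash
# Hypothetical check: print every page number between 1 and LAST that has
# no file named drewesNNNN.jpg in the given directory.
find_missing_pages() {
    local dir="$1" last="$2" n=1
    while [ "$n" -le "$last" ]; do
        [ -e "$dir/$(printf 'drewes%04d.jpg' "$n")" ] || echo "$n"
        n=$((n + 1))
    done
}

# Example call (directory name and page count are assumptions):
# find_missing_pages merged 1200
```

If the function prints nothing, every page from 1 to the given last page has a file.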

Academic Freedom

Distinguishing the signal from the noise is one of the main challenges in science. Scientists are trained to understand and judge the uncertainty in the world. We discuss our results and their limitations in articles, and their merit is judged through the peer review process. Often these academic discussions have no immediate influence on the lives of the people around us, people who are not trained in interpreting statistics.

Italy has a tradition of lengthy legal proceedings, and the recent overturning of six manslaughter convictions for Italian earthquake scientists is no exception. They were part of an expert panel discussing the earthquake risks for the central Italian city of L’Aquila on March 31st 2009. The citizens felt reassured and many decided to spend the night inside their houses. It is argued that 29 of the 309 victims of the tragic earthquake that followed fell victim to this decision.

The six scientists and Bernardo De Bernardinis, who in 2009 was deputy head of Italy’s Civil Protection Department, were originally sentenced to six years in jail in an October 2012 trial. The trial caused an international outcry amongst scientists. How could we continue to discuss science in public when there is a chance that we could be jailed for it? How can we contribute to expert panels that advise policy makers? Earthquake scientists know and understand the uncertainty associated with their predictions. Could they be punished for the ignorance of their fellow citizens?

Judge Marco Billi justified his verdict by arguing that the panel had carried out a “superficial, approximate and generic” risk analysis. It is rare that judges participate in a scientific peer review process, but in this case it happened.

In the appeal court, judge Fabrizia Francabandera acknowledged that the scientists could not have predicted the earthquake and overturned the original verdict, with one exception. De Bernardinis received a two-year sentence for his role in communicating with the public.

This is certainly a relief for many scientists, but the controversy around this case is directly relevant not only for New Zealand’s earthquake scientists, but for all scientists. For science to work we need to be able to make mistakes and to openly discuss our results. If society aims to prosecute us for the work we do for it, then we had better hire an army of lawyers. I think we could spend our money more wisely. We should invest in the scientific education of our students.

Given the two-year sentence for De Bernardinis, I will certainly be more careful when talking to the media in the future. I do not want to be held responsible for not warning the world about the upcoming robot uprising.

Die Chronik der Drewes

Together with my father I digitized the German book “Die Chronik der Drewes” (PDF File) by Hans Troebs. Parts one and two together run to 2127 pages. It has been a major effort to photograph and OCR the whole book. The book is about the history of the Drewes family all across Germany. Here is the summary, translated from the German:

The chronicle of the Drewes, Dreves, Drews, Drefs, Dreffs, Drebes, Drebs, Dreps, Drewsen, Drewis, Drevsen, Trebes, Trebs, Troebes, Tröbs, Troebs, Tröps, Tröbus, Trebst, Trübst, Troebst, Trebitz, Tröbitz, Trebesius, Trebus, Trebbus, Trebuß, also Drees, Drebus, Dröbus, Trebuth, Trebbuth, Tributh, Trips, Treibs, Trebsdorf.

Embedded in general history and bound up with the life of their homeland, they suddenly emerge from a nameless age with the origin of their family name and then live on through the centuries up to their widely branched distribution in the present. With insights into the local chronicles, retrospectives on earlier centuries, oral and written traditions, lines of descent, biographies and life dates, as well as coats of arms and pictures of the families, of their places of residence, houses and farms, researched in all their diversity, presented and published.

It is a very rare book and not even available on the second-hand market. So we made the effort to make it available.