Bulk convert HTML, RTF, etc. documents to PDF using the Mac OS X command line or an AppleScript
OS X has PDF printing support built into all applications, and this PDF support can be used from the command line or in an AppleScript script / application to convert virtually any kind of document to the PDF format.
In the example below there are five HTML files (file1.html, file2.html, …, file5.html) that have to be converted to a single PDF (all.pdf). To accomplish this on the command line, open up a new Terminal window, go to the folder the HTML files are stored in, and optionally execute this command to merge them into one file:
cat file*.html > all.html
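One caveat with the merge step: if all.html is left over from an earlier run, a wildcard can pick it up as an input too. Here is a minimal sketch of a safer merge, assuming a bash-compatible shell. The function name merge_html is made up for illustration and is not an OS X tool.

```shell
# merge_html OUTPUT INPUT... : concatenate the INPUT files into OUTPUT.
# Hypothetical helper, not part of OS X.
merge_html() {
  merge_out="$1"
  shift
  # Build the result in a temporary file so a failed run leaves OUTPUT untouched.
  merge_tmp="$(mktemp)" || return 1
  for merge_in in "$@"; do
    # Skip the output file itself in case a wildcard matched it.
    [ "$merge_in" = "$merge_out" ] && continue
    cat "$merge_in" >> "$merge_tmp" || { rm -f "$merge_tmp"; return 1; }
  done
  # Copy the merged text into place, then remove the temporary file.
  cat "$merge_tmp" > "$merge_out" && rm -f "$merge_tmp"
}
```

It would be called as: merge_html all.html *.html. Wildcards expand in alphabetical order, so with ten or more files file10.html would be merged before file2.html unless you list the files explicitly.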
Next, call Apple’s ‘convert’ utility (this is not the ‘convert’ utility from ImageMagick) and specify the input and output filenames:
/System/Library/Printers/Libraries/convert -f all.html -o all.pdf
The ‘convert’ utility will build the output file and write it to your shell’s current working directory (the folder you changed to earlier). The command to convert to or from other file types is similar, except that you can’t ‘cat’ the input files together if they’re not text-based.
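To get one PDF per input file instead of a single merged PDF, the same command can be run in a loop. This is a rough sketch, assuming a bash-compatible shell and that the utility still lives at the path used above. Only the -f and -o flags come from this article; the CONVERT variable and the function name convert_each are illustrative.

```shell
# Path taken from the command above; set CONVERT beforehand to override it.
CONVERT="${CONVERT:-/System/Library/Printers/Libraries/convert}"

# convert_each FILE... : write FILE.pdf next to every FILE (hypothetical helper).
convert_each() {
  for doc in "$@"; do
    # Skip anything that is not a regular file (e.g. an unmatched wildcard).
    [ -f "$doc" ] || continue
    # Quote the names so paths containing spaces survive intact.
    "$CONVERT" -f "$doc" -o "$doc.pdf" || echo "failed: $doc" >&2
  done
}
```

For example, convert_each *.html *.rtf would produce file1.html.pdf, file2.html.pdf and so on in the same folder.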
This functionality can be automated by writing an AppleScript application that makes use of the command line tool. Enter the following code into Script Editor (found in the Utilities folder in your Applications folder):
on open input_documents
    repeat with this_document in input_documents
        set this_document_path to POSIX path of this_document
        --display dialog this_document_path
        do shell script "/System/Library/Printers/Libraries/convert -f " & quoted form of this_document_path & " -o " & quoted form of this_document_path & ".pdf"
    end repeat
end open
Click ‘File->Save’ to save the script’s source code, then ‘File->Save As…’ and change the File Format to Application to save an executable (application) version of the script. If you drop documents on this application, it will automatically launch and create PDF copies of the documents in their source folder.
07 Feb 2009 
If I use the above command to convert HTML to PDF, the generated PDF is not formatted properly.
Please help me generate the PDF in the required format.
Hey Murthy
If it’s a multi-page HTML document you’re trying to convert, maybe you should try using the “Save to PDF” feature available in all Cocoa apps (in the browser, click File->Print->PDF->Save as PDF).
Alternatively (if you’re trying to capture a webpage that’s not too image-heavy) you could use an application like LittleSnapper to grab a full-page screenshot, then open that screenshot in Preview and save it to a PDF.
Let me know if either of those suggestions work!
@willem
This is really a nice command! Thanks a lot for sharing it. Do you know of any documentation available for this command? I looked briefly but could not find anything quickly. I mean the help you get from the command itself is enough to get you going but still
Hi constantin
Sorry, the only available documentation seems to be the built-in help:
First, thanks so much for this very helpful-seeming tip. I’ve messed around some with Automator, but I am a scripting newbie and I am scared of Terminal, so I am trying to do it in AppleScript. I keep getting this warning, and then it doesn’t work:
convert: Unable to determine MIME type of [the path to the file I dropped onto it]
Can you tell me what I am doing wrong?
My ultimate goal is to be able to print Mail .doc attachments as PDFs so they open faster and I don’t need to work in Word.
Thanks so much for any assistance.
Hey Jesse
When you say you want to convert Mail .doc attachments, just keep in mind that you shouldn’t convert them directly in the “~/Library/Mail Downloads/” folder, otherwise Mail won’t recognize them as the attachments on the messages.
First, save the attachments somewhere else (e.g. a folder on your Desktop), then use the above script.
I think the reason you’re getting that error message might be that it’s a Word document (.doc or .docx) you’re trying to convert. The ‘convert’ utility apparently doesn’t recognize Word’s file formats. The older .doc format in particular is a proprietary binary format that many apps can’t read (the developers of applications like OpenOffice and Pages had to reverse-engineer it to be able to read and write it).
I think unfortunately you’re either going to have to convert the DOC / DOCX files to RTF first, then run the above scripts, or print the files directly to PDF when you have them open. :/
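The RTF route could be scripted along these lines. This is only a sketch: it assumes the textutil command-line tool that ships with OS X can read the Word file in question, and that ‘convert’ accepts the resulting RTF. The function name doc_to_pdf and the TEXTUTIL / CONVERT variables are invented for the example.

```shell
# Both tool locations are assumptions; set the variables beforehand to override them.
TEXTUTIL="${TEXTUTIL:-textutil}"
CONVERT="${CONVERT:-/System/Library/Printers/Libraries/convert}"

# doc_to_pdf FILE.doc : create FILE.rtf, then FILE.pdf, next to the original.
# Hypothetical helper, not part of OS X.
doc_to_pdf() {
  word_doc="$1"
  base="${word_doc%.*}"   # strip the .doc / .docx extension
  # Step 1: Word document -> RTF
  "$TEXTUTIL" -convert rtf -output "$base.rtf" "$word_doc" || return 1
  # Step 2: RTF -> PDF, using the same flags as the article
  "$CONVERT" -f "$base.rtf" -o "$base.pdf"
}
```

Run it on a copy saved outside the Mail Downloads folder, for example: doc_to_pdf ~/Desktop/attachments/letter.doc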
If the HTML file has an <img> tag (image import), then the PDF produced simply has a placeholder icon for the image, but not the image itself.