Extract and sort email addresses from text files with Automator
Author: willem In: apple, coding, productivity, tools
Automator (Mac OS X’s drag & drop batch script-building tool) can be used to create applications that streamline repetitive tasks without any knowledge of programming.
Follow these steps to build an Automator workflow that will search all text files in a specified folder for email addresses, sort these email addresses alphabetically, and remove any duplicates:

Launch Automator from your ‘/Applications’ folder and choose to create a ‘Custom’ workflow.
Add an ‘Ask for Confirmation’ action to the workflow and give it an appropriate name and description to ask the user whether she’d like to continue.
Add an ‘Ask for Finder Items’ action to the workflow to let the user select the folder her text files reside in.
Add a ‘Set Value of Variable’ action and create a new variable to store the user’s selection for later use.
Add a ‘Run Shell Script’ action, set the Shell to ‘/bin/bash’, and set Pass input to ‘as arguments’. Add this line into the body of the shell script (it must be pasted as a single line):
grep -Eiorh '([[:alnum:]_.]+@[[:alnum:]_]+?\.[[:alpha:].]{2,6})' "$@" | sort | uniq
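The above command uses the UNIX grep utility to perform a regular expression search for email addresses in the folder the user specified (which is automatically passed into this action as ‘$@’). The flags tell grep to use extended regular expressions (-E), ignore case (-i), print only the matching text (-o), search the folder recursively (-r), and omit filenames from the output (-h). The results from this search are piped into the sort utility to be sorted alphabetically, then into the uniq utility to remove duplicates.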
Add a ‘New Text File’ action to the workflow, specify a name for the new file, set Where to the variable storing the folder the user selected, and Encoding to ‘Unicode (UTF-8)’. Check the ‘Replace existing files’ checkbox.
Add an ‘Ask for Confirmation’ action and change the name and description appropriately to notify the user that the process has completed.
To save the workflow, choose ‘File->Save’ from the menubar and specify a filename, then choose ‘File->Save As…’ and set the File Format to ‘Application’ to compile a standalone application.
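
Before wiring the shell script step into Automator, it can help to try the same pipeline in Terminal. The following is only a rough sketch: the `extract_emails` function name, the temporary folder, and the sample addresses are made up for illustration and are not part of the workflow itself.

```shell
#!/bin/bash
# Hypothetical helper wrapping the pipeline from the 'Run Shell Script' step,
# so it can be called on any folder from Terminal.
extract_emails() {
  grep -Eiorh '([[:alnum:]_.]+@[[:alnum:]_]+?\.[[:alpha:].]{2,6})' "$@" | sort | uniq
}

# Build a throwaway folder with two sample text files (made-up addresses).
demo_dir=$(mktemp -d)
printf 'Contact bob@example.com or alice@example.org today\n' > "$demo_dir/a.txt"
printf 'Again: bob@example.com and carol_x@example.net\n' > "$demo_dir/b.txt"

# Search the folder recursively, then sort and de-duplicate the matches.
extract_emails "$demo_dir"

# Clean up the sample folder.
rm -rf "$demo_dir"
```

Each of the three sample addresses should be listed once, in alphabetical order, even though bob@example.com appears in both files. Note that uniq only removes adjacent duplicate lines, which is why the output is sorted first, and that addresses differing only in letter case are kept as separate entries.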

Geekology’s example of this Find Email Addresses in Files workflow can be downloaded here, with a compiled version of the application available here.

Ted
August 15th, 2009 at 20:45
Awesome! Looked for hours trying to find a script that would extract email addresses from the bodies of 1000’s of emails in mail.app format. Presto! Even deleted duplicates on the fly. Thanks.
Daniel
December 1st, 2009 at 13:00
Remember to paste the code
grep -Eiorh '([[:alnum:]_.]+@[[:alnum:]_]+?\.[[:alpha:].]{2,6})' "$@" | sort | uniq
as one line in the shell box. If you copy paste from the example, you get two lines of code.
for f in "$@"
do
grep -Eiorh '([[:alnum:]_.]+@[[:alnum:]_]+?\.[[:alpha:].]{2,6})' "$@" | sort | uniq   # <- ONE LINE
done
all the best
Alex
January 16th, 2010 at 11:29
Thanks!
andrea
February 10th, 2010 at 00:09
Daniel - Your tip really helped! Thanks
Jagster
February 22nd, 2010 at 00:41
This script saved me COUNTLESS hours!
Send me your email address so I can donate some $$$ to you.
– John
willem
March 4th, 2010 at 09:02
Haha, no need John, just tell your friends about Geekology if you think they might find it useful.