Sunday, September 26, 2010

Saving weblinks to org-mode from Safari

Each day, I come across numerous web articles, blog posts, newsgroup posts, etc., that appear interesting. Often, I discover them while working on another task. To avoid distraction, I typically save their links for later review. Sometimes I drag the links to my desktop. Sometimes I bookmark them in my browser. Sometimes I send them to myself via email. Sometimes I post them to my Delicious account. It's time to admit that I need a better process.

In a prior post, I mentioned that I use Emacs's org-mode to organize my notes and tasks. I recently set up org-mode's Capture capability to easily record the deluge of thoughts that come at random throughout the day. While reading the documentation, I discovered the solution to my link-saving woes: org-protocol. In particular, it can capture links to an org file directly from a web browser, as demonstrated in this screencast.

Imitating the screencast on OS X turned out to be harder than I expected, so I thought it worthwhile to post my approach.

To begin, it helps to understand how org-protocol captures work. org-protocol is built on the Emacs server, which allows other applications to use a running Emacs for text editing. A common practice is to have shells use an already running Emacs instance rather than starting a new one. This is accomplished by a helper program, emacsclient, that communicates with the primary Emacs instance. For org-protocol captures, emacsclient is launched with a specially formatted argument.

emacsclient org-protocol:/capture:/tname/http://foo.com

By advising the server-visit-files function, org-protocol detects such arguments and creates org-mode entries from them. A powerful template facility is provided to specify how the argument information is transformed into an org entry. Multiple templates are supported and selected by the argument's tname field. The remainder of the argument specifies the URL, in this case http://foo.com, and an optional note (not shown in the example).
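
To make the argument layout concrete, here is a minimal JavaScript sketch of how such an argument could be split into its fields. It is only an illustration: org-protocol does this work in Emacs Lisp, `parseCaptureArg` is a hypothetical name, and the sketch assumes each field is percent-encoded the way the bookmarklet later in this post encodes it.

```javascript
// Sketch only: org-protocol itself is written in Emacs Lisp.
// parseCaptureArg is a hypothetical helper, not part of any library.
function parseCaptureArg(arg) {
  // Accept one or more slashes after each colon (":/" or "://").
  const match = /^org-protocol:\/+capture:\/+([^/]+)\/(.*)$/.exec(arg);
  if (match === null) {
    return null; // not an org-protocol capture argument
  }
  const [, template, rest] = match;
  // Assumption: the remaining fields are separated by "/" and each one
  // is percent-encoded, so a "/" inside a field appears as "%2F".
  const [url = '', title = '', note = ''] =
    rest.split('/').map((field) => decodeURIComponent(field));
  return { template, url, title, note };
}

console.log(parseCaptureArg(
  'org-protocol://capture://tname/http%3A%2F%2Ffoo.com/Foo%20Page/some%20note'
));
```

The first field selects the template, and the rest supply the link, its title, and the optional note.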

Templates are specified by elisp code like the following:

(setq org-capture-templates
      '(("tname" "Link" entry 
        (file+headline org-default-notes-file "Links to Read")
        "* %a\n %?\n %i")))

This template, called tname, tells org to save the entry in the default notes file under the heading "Links to Read", with a link to the URL as a sub-heading. This results in something like:

* Links to Read
** http://foo.com

See the org-mode documentation on capture templates for all of the template facility's capabilities and options.

In the screencast, two "tricks" are used to have Firefox call emacsclient with the argument needed to capture the present page. The first is a bookmarklet that creates the appropriate emacsclient argument:

javascript:location.href='org-protocol://capture://tname/'+
      encodeURIComponent(location.href)+'/'+
      encodeURIComponent(document.title)+'/'+
      encodeURIComponent(window.getSelection())
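
For experimenting outside the browser, the same string construction can be sketched as a plain function. `buildCaptureUri` is a hypothetical name, and the page URL, title, and selection are passed in as ordinary strings instead of being read from `location` and `document`.

```javascript
// Hypothetical helper that mirrors the bookmarklet's string building.
// "template" must match a key in org-capture-templates ("tname" here).
function buildCaptureUri(template, pageUrl, pageTitle, selection) {
  // Encode each field so that a "/" inside it is not mistaken
  // for the separator between fields.
  const fields = [pageUrl, pageTitle, selection].map((field) =>
    encodeURIComponent(String(field))
  );
  return 'org-protocol://capture://' + template + '/' + fields.join('/');
}

console.log(buildCaptureUri('tname', 'http://foo.com', 'Foo Page', ''));
// prints "org-protocol://capture://tname/http%3A%2F%2Ffoo.com/Foo%20Page/"
```

The encoding step is what keeps a page title such as "a/b" in a single field; it becomes "a%2Fb" in the result.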

The second trick takes advantage of the fact that the emacsclient argument is formatted as a URI. The first component of a URI, everything before the first colon, is called the URI scheme. The most familiar scheme, http, represents HTTP hyperlinks and is handled directly by the browser. Many other URI schemes exist, and most browsers support launching separate programs, sometimes called URI handlers, to process them. In the screencast, Firefox is configured to launch emacsclient when links with the org-protocol scheme are "clicked".
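
The dispatch idea can be sketched in a few lines of JavaScript. This is not how any particular browser implements it; the handler table and its entries are made up for illustration, and only the scheme syntax (a letter followed by letters, digits, "+", "-", or ".") comes from the URI standard, RFC 3986.

```javascript
// Hypothetical handler table; the values stand in for whatever
// programs happen to be registered on a given system.
const handlers = new Map([
  ['mailto', 'mail client'],
  ['org-protocol', 'emacsclient'],
]);

// Extract the scheme: everything before the first ":", if it is well formed.
function schemeOf(uri) {
  const match = /^([A-Za-z][A-Za-z0-9+.-]*):/.exec(uri);
  return match === null ? null : match[1].toLowerCase();
}

// Decide who should process a URI: the browser itself or an external handler.
function handlerFor(uri) {
  const scheme = schemeOf(uri);
  if (scheme === null) return null;
  if (scheme === 'http' || scheme === 'https') return 'browser';
  return handlers.has(scheme) ? handlers.get(scheme) : null;
}

console.log(handlerFor('org-protocol://capture://tname/http%3A%2F%2Ffoo.com'));
// prints "emacsclient"
```

A URI whose scheme has no registered handler yields null here, which corresponds to the browser not knowing what to do with the link.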

Duplicating this functionality with Safari on OS X turned out to be harder than expected. In an effort to make things easy, OS X provides the Launch Services API to automate the registration of URI schemes and their handlers. Unfortunately, it provides no manual way to specify new URI schemes and handlers. This means Safari can't simply be told to launch emacsclient when org-protocol links are "clicked".

While searching for a solution, the Worg website led me to the org-mac-protocol project. Although promising, org-mac-protocol uses AppleScripts executed via a drop-down menu. Call me lazy, but I really like the bookmarklet approach. org-mac-protocol also provides far more functionality than I was interested in. I like to keep things simple.

A second look at the Launch Services documentation revealed that two mechanisms are provided to register URI schemes and handlers. Using a programmatic API, applications can, at execution time, register themselves as the handler for new URI schemes. Alternatively, applications can include the URI scheme and handler information in their application bundle property list. Of the two approaches, the property list approach looked like the best to pursue.

Property lists are used by OS X to store application and user settings. They're essentially Objective-C objects serialized to XML. Every OS X application contains a default property list in its application bundle that the Finder reads after certain events. While processing property lists, the Finder will register any URI schemes and handlers it finds with Launch Services. With this approach, all that is necessary to register a new URI scheme and handler is to edit an XML text file.

Unfortunately, modifying Emacs's application bundle plist didn't seem to be an option. I didn't see a way to specify the helper program emacsclient as the URI handler.

I knew from prior experience that AppleScripts can be packaged as application bundles. I reasoned that I could write an AppleScript to launch emacsclient, save it as an application bundle, and modify its property list to register the script as a handler for org-protocol URIs. I suspected that this had been done before, and a quick Google search led me to this Stack Overflow thread.

Using AppleScript Editor and the Stack Overflow thread as an example, I wrote the following script and saved it as an application bundle called EmacsClientCapture.app.

on open location this_URL
   -- "quoted form of" keeps the shell from interpreting characters in the URL
   do shell script ¬
      "/Applications/Emacs.app/Contents/MacOS/bin/emacsclient " ¬
      & quoted form of this_URL
end open location

Next, I edited the script's plist at the path,

EmacsClientCapture.app/Contents/Info.plist

There, I added the following XML elements just before the final </dict></plist> tags.

<key>CFBundleIdentifier</key>
<string>com.mycompany.AppleScript.EmacsClientCapture</string>
<key>CFBundleURLTypes</key>
<array>
  <dict>
    <key>CFBundleURLName</key>
    <string>EmacsClientCapture</string>
    <key>CFBundleURLSchemes</key>
    <array>
      <string>org-protocol</string>
    </array>
  </dict>
</array>

I then moved the application bundle to the /Applications directory. This caused the Finder to read the property list and register EmacsClientCapture with Launch Services as the handler for the org-protocol URI scheme.

I added the bookmarklet described above to Safari and voilà! I can now click on the bookmarklet and save a link to the current page in an org-mode file. No more disorganization. Of course, it's now easier to collect distractions, but that is a problem for a future post.