Friday, July 22, 2011

A Busy Summer

The summer is proving busier than expected. In June, I started taking remote classes at Stanford in pursuit of a Graduate Certificate in Data Mining and Applications through the Stanford Center for Professional Development (SCPD). So far, I am enjoying my first class, stats202. As I suspected, I really like working with data to find hidden patterns, trends, etc.

At the same time, I have a project at work with an aggressive mid-August deadline. So I've been putting in extra hours. The good news is that the project is going well and I'm doing lots of R programming to analyze performance data. More fun working with data. I like R and am gaining a lot of experience with it through work and school projects.

Any spare time not spent with family is being used to work on a few hobby projects. I'm building a Zen Toolworks desktop CNC machine so that my son and I can make cool things. I bought an Arduino and Ultimate Microcontroller Pack from the Maker Shed that I'm having fun with. If that weren't enough, I'm slowly putting together a plan to build a tricopter.

A busy summer indeed.

Friday, May 6, 2011

TomTom Foolery

Three years ago I bought a TomTom ONE XL for a family trip. It worked great but I haven't used it much since then. We took another family trip a couple of weeks ago and I again wanted to take the TomTom. I decided to update its maps and thus ensued an unexpected adventure.

I bought an updated (and overpriced) map via the TomTom Home desktop application. Partway through copying the map to the device, the application complained that the device was out of space. A little digging revealed that there wasn't room for both the old and new maps. I looked for a way to uninstall the original map via the desktop application but couldn't find one. Luckily, OS X mounted the TomTom as a FAT-formatted disk so I just deleted the old map files. No more out-of-space problem.

I confidently re-started copying the map to the TomTom but ran into another problem. The TomTom suddenly disconnected. Repeated attempts all ended with the same error message that the USB device had unexpectedly disconnected. I tried different USB cables and ports but nothing worked. With the trip looming, it was time for some hacking.

I opened an OSX Terminal and used dd to write files with increasing sizes to the TomTom. It consistently disconnected when writing files larger than 100MB. Writing smaller files with intervening pauses seemed to work OK. Now that I knew what I could do, I turned my attention to what I needed to do.
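
A probe of this kind can be sketched as below. This is a minimal reconstruction under stated assumptions, not the exact commands used at the time: the function name probe_sizes, the PAUSE variable, and the probe file names are all hypothetical, and the demo writes to a scratch directory rather than to a real device.

```shell
#!/bin/sh
# Sketch only: write test files of increasing size and report the first failure.
# probe_sizes, PAUSE, and the probe_*.bin names are hypothetical, not from the post.

PAUSE="${PAUSE:-1}"   # seconds to wait between writes

probe_sizes() {
    dir="$1"; shift
    for mb in "$@"; do
        f="$dir/probe_${mb}MB.bin"
        # bs=1048576 is 1 MiB, spelled out so BSD and GNU dd both accept it.
        if dd if=/dev/zero of="$f" bs=1048576 count="$mb" 2>/dev/null; then
            echo "ok: ${mb}MB"
            rm -f "$f"
        else
            echo "failed: ${mb}MB"
            return 1
        fi
        sleep "$PAUSE"
    done
}

# Demo against a scratch directory with small sizes so it finishes quickly.
scratch="$(mktemp -d)"
probe_sizes "$scratch" 1 2 4
rm -rf "$scratch"
```

Against a real device, the first argument would be wherever the device is mounted and the sizes would bracket the suspected limit, for example 25 50 100 200.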

I figured out that TomTom Home put the new map files at the path,

~/Documents/TomTom/HOME/Download/complete/map/USA_and_Canada/

The directory contained the files,

$ ls -lh
total 1790632
-rw-r--r--  jcardent  staff   7.7K Apr 13 16:25 USA_and_Canada-1.gif
-rw-r--r--  jcardent  staff   1.6K Apr 13 17:22 USA_and_Canada.gif
-rw-r--r--  jcardent  staff   2.6K Apr 13 17:30 USA_and_Canada.toc
-rw-r--r--  jcardent  staff   874M Apr 13 17:30 USA_and_Canada.zip
-rw-r--r--  jcardent  staff   305B Apr 13 17:22 activation.zip

Clearly, the file USA_and_Canada.zip contained the majority of the data. So I unzipped it and found,

$ ls -lh
total 1790504
-rw-r--r--  jcardent  staff   252K Jan 10 11:05 USA_and_Canada-308.meta
-rwxr-xr-x  jcardent  staff    60B Jan 10 11:05 USA_and_Canada.pna
drwxr-xr-x  jcardent  staff   3.1K Jan 10 11:04 brand
-rwxr-xr-x  jcardent  staff   446M Jan 10 11:04 cline.dat
-rwxr-xr-x  jcardent  staff   113M Jan 10 11:05 cname.dat
-rwxr-xr-x  jcardent  staff   123M Jan 10 11:05 cnode.dat
-rwxr-xr-x  jcardent  staff    23M Jan 10 11:05 cphoneme.dat
-rwxr-xr-x  jcardent  staff    56M Jan 10 11:05 faces.dat
-rwxr-xr-x  jcardent  staff    30B Jan 10 11:05 mapinfo.dat
-rwxr-xr-x  jcardent  staff    92M Jan 10 11:05 poi.dat
-rwxr-xr-x  jcardent  staff    15M Feb 18 10:53 tables.dat
-rwxr-xr-x  jcardent  staff   4.9M Jan 10 11:05 tmccodes.dat
-rwxr-xr-x  jcardent  staff   129B Jan 10 11:05 traffic.dat

Three files over 100MB, oh bother. But there was no need to fear for I was armed with dd. I proceeded to use a command like the following to copy the large files to the TomTom in 50MB chunks,

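dd if=./<map file> of=/<TomTom path> bs=1024 count=51200 \
  iseek=<offset> oseek=<offset>

With bs=1024, a count of 51200 gives 50MB per chunk, and both offsets are counted in 1024-byte blocks.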
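
Repeating that command by hand for every chunk is tedious, so the same idea can be wrapped in a loop. The following is a sketch under assumptions rather than the script actually used: chunked_copy and its arguments are hypothetical names, skip= and seek= are used as the portable spellings of BSD dd's iseek= and oseek=, and conv=notrunc is added so that each chunk leaves the earlier ones in place. A 50MB chunk is 51200 blocks of 1024 bytes.

```shell
#!/bin/sh
# Sketch only: copy a file in fixed-size chunks with a pause between chunks.
# chunked_copy and its parameters are hypothetical, not from the post.

chunked_copy() {
    src="$1"; dst="$2"
    chunk_blocks="${3:-51200}"   # 51200 x 1024 bytes = 50 MiB per chunk
    pause="${4:-5}"              # seconds to wait between chunks
    bs=1024
    size=$(wc -c < "$src")
    total_blocks=$(( (size + bs - 1) / bs ))   # round up to whole blocks
    : > "$dst" || return 1                     # start from an empty destination
    offset=0
    while [ "$offset" -lt "$total_blocks" ]; do
        # skip= and seek= are both measured in blocks of $bs bytes.
        dd if="$src" of="$dst" bs="$bs" count="$chunk_blocks" \
           skip="$offset" seek="$offset" conv=notrunc 2>/dev/null || return 1
        offset=$(( offset + chunk_blocks ))
        sleep "$pause"
    done
}

# Demo: copy a 5000-byte file in 2-block chunks and verify the result.
demo="$(mktemp -d)"
dd if=/dev/urandom of="$demo/src.dat" bs=1000 count=5 2>/dev/null
chunked_copy "$demo/src.dat" "$demo/dst.dat" 2 0
cmp "$demo/src.dat" "$demo/dst.dat" && echo "copy matches"
rm -rf "$demo"
```

Comparing source and destination with cmp afterwards is a cheap way to confirm that no chunk was dropped or misplaced.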

After an hour or so, all the data was on the TomTom. I disconnected and rebooted it only to get a "map not authorized" error. After some curses, I recalled the other downloaded file, activation.zip. I unzipped the file, copied the contents to a couple of places on the TomTom - I wasn't sure where it belonged - and rebooted. Woot! The updated map worked!

I'm happy to report that the TomTom worked flawlessly for our vacation.

Moral of the story: know and use your UNIX command-line tools.

Friday, April 29, 2011

Diving into R

I've wanted to learn R for a long time. A new project at work is providing an ideal opportunity to finally use it. So far, it's been a great experience. R is an incredibly powerful tool for data analysis. It's allowed me to dive deep into the project's data and automate much of the analysis process.

Programming in R has been easier than expected. I've previously programmed in MATLAB, which has helped greatly. Some of the concepts are still foreign but I'm confident that they will become less so with time.

The greatest joy has been getting "lost" for hours writing R functions to analyze the data and produce reports. R's interactive interface has made it easy to build up code in an exploratory manner. This is my preferred programming methodology that, I find, allows me to stay in a flow state for long periods of time. The experience has been very similar to programming in Lisp dialects which I also deeply enjoy.

Although there is a lot of good information about R available for free on the web, I've found the following O'Reilly books the best resource for coming up to speed quickly,

A particularly powerful library is ggplot2 by Hadley Wickham. With it, I've been able to create very complex graphs and charts with minimal code. ggplot2 uses a grammar to create graphics in layers that, at first, can be challenging to learn. The website is informative but the book has been the best resource and well worth the money.

Another useful library is brew, which I am using to auto-generate pleasant-looking reports in PDF via LaTeX.

I look forward to working more with R. Data science is a growing interest of mine and this opportunity to use R is adding to the momentum.

Monday, April 11, 2011

Book Review: Final Jeopardy

Final Jeopardy: Man vs Machine and the Quest to Know Everything by Stephen Baker

I found the Watson exhibition very exciting. I was therefore eager to read Baker's new book, Final Jeopardy, which recounts the inception of IBM's Jeopardy Grand Challenge and the software team that completed it by creating Watson. Although light on technical details, the book provides a good overview of the primary challenges. It also discusses the non-technical issues that the Watson and Jeopardy teams struggled with in staging the man-machine competition. Overall, a very good and enjoyable book. If you enjoyed Baker's Numerati, you'll probably enjoy this book too.

The next challenge is to create a computer that can write Jeopardy questions rather than just answering them.

Monday, April 4, 2011

Seymour Cray Videos

I've long admired Seymour Cray as the genius behind early supercomputers such as the CDC 6600, Cray-1, and later Cray systems. However, I know little about Cray himself. So, I was happy to discover two YouTube videos of Cray speaking about his career and systems.

In this 1976 talk, Cray describes the design of the Cray-1. Among other topics, he describes the factors that gave rise to the Cray-1's iconic shape.

Thirteen years later, Cray discusses the design of the Cray-3 and Cray-4 systems in this talk and his decision to use gallium arsenide, then a leading-edge material. I wasn't aware of the three-dimensional modules used in the Cray-3. Cool stuff.

I enjoyed both talks. Cray was much more personable than I expected. He was very humble and claimed ignorance in a number of areas related to computing. It was refreshing to see someone of Cray's caliber display these characteristics.

It was amusing to see that the fundamental problems of building computing systems have remained the same for decades: speed, size, and power. The more things change, the more they stay the same.

Sunday, March 6, 2011

Quants: The Alchemists of Wall Street

Last week, I stumbled across a good documentary by VPRO on quantitative analysts. It features a couple of famous "quants", Paul Wilmott and Emanuel Derman, as well as Michael Osinski, who wrote the software used by many banks to securitize mortgages.

The documentary discusses the challenges associated with financial modeling. For example,

  • Many models were based on limited historical data that was insufficient to represent macro-economic swings.
  • Many executives did not understand the technical aspects of financial modeling and were therefore unable to recognize the associated risks that led to the subprime crisis.

I strongly agree with Paul Wilmott on the following (paraphrased) point,

People that take risk should be compensated. But they should not be compensated for taking risk with other people's money.

Hear, hear. Wilmott is extremely impressive. When the subprime crisis hit, I was surprised to find out that he had been warning against model-related risks. Given his high regard in quant circles, I'm surprised his warnings were not better heeded.

Saturday, February 26, 2011

Commemorating Discovery's Last Launch

In commemoration of the Space Shuttle Discovery's last flight, I decided to post a link to the YouTube videos of MIT's Fall 2005 session of Aircraft Systems Engineering (16.885J). The course was co-taught by ex-shuttle astronaut Jeffrey Hoffman and ex-NASA official Aaron Cohen. It featured many guest speakers from the Shuttle program who went into a lot of technical detail about the system's design and operations.

All of the videos are good but my favorites are,

Whenever I need a hard-core technical fix, I watch one of these videos. Works every time. These were real engineers.

Other materials from this course are available on MIT's OCW website.

Thank you, Discovery, for twenty-seven years of service. It's disappointing that the space program is returning to rockets. It's just so 20th century.

Rickards on Global Sovereign Debt

It's one thing to read tin-foil hat blog posts about the impending collapse of the global economy. It's another to watch the same message from James G. Rickards, a (seemingly) intelligent and credible source.

I liked Rickards' approach of using dynamic systems theory to analyze global markets and economies. The technique changes the conversation from a philosophical one (i.e. Keynesians vs. Monetarists vs. Austrians) to a scientific, empirical one.

To summarize, Rickards says we're screwed. Decades of deficit spending and the bailout have led to the accumulation of an unsustainable amount of debt - no combination of growth and taxes can satisfy the liability. Rickards makes the point that this debt can't magically disappear; it has to be flushed from the system either through default, inflating currency, or debtors and creditors going to war. This sounds sensational until Rickards explains how this happened multiple times in the 20th century alone.

Scary stuff but worth watching. Wishful thinking isn't going to fix the current economic situation. I much prefer a data-based approach to understanding and solving the problem. I hope Rickards is wrong but, if not, I hope someone is listening.

Friday, February 25, 2011

Shared Keyboard Madness

I'm not alone in my mechanical keyboard madness. Ars Technica posted a good video about ergonomic keyboards. Their recommendation? Mechanical switches and a split layout.

I'm really enjoying the Kinesis Contoured keyboard. The wrist pain I was experiencing has disappeared. I only wish I had a second one for home as it is now uncomfortable to go back to using the MS Natural Ergonomic - the rubber dome keys require a lot more force.

Sunday, February 20, 2011

More Watson

There is a lot of good information available online about Watson. Many of my friends have wondered about Watson's wagering algorithms so I was happy to find this blog post and video by IBM Research. I find Watson's avatar endearing, and this video on its creation was fun to watch.

Part of me wonders if (read: hopes that) Watson will inspire a whole new generation of AI researchers.

Thursday, February 17, 2011

Watson Wins!

Well, Watson won Jeopardy! It's an amazing feat. I look forward to seeing where IBM takes the technology.

NOVA has made their episode on Watson's creation free to watch online. IBM has posted a YouTube video of a talk by Dr. David Ferrucci on the overarching DeepQA project. Ken Jennings, one of the human contestants, gave an entertaining live Q&A.

I would love to write software for a system like Watson. As a kid, I was fascinated by fictional super-intelligent computers like HAL, WOPR, and MCP. Over the past year, I've rediscovered a strong interest in data mining and machine learning. While I've helped develop server and storage systems capable of hosting such applications, I have not yet worked with these technologies directly. This is something I'm considering as part of my annual career planning.

Friday, February 11, 2011

Andrew Lo, Kill the Quants

A good talk by Andrew Lo, Director of the MIT Laboratory for Financial Engineering, on the merit of blaming quantitative methods for the subprime crisis.

Lo's main assertion is that,

"blaming quantitative methods for the financial crisis is like blaming accounting and the real number system for accounting fraud"

Instead, he suggests blaming the people that used the methods inappropriately rather than the methods themselves. From what I've read of the subprime crisis, I agree.

Lo partially demystifies the subprime crisis by using a simple example to explain collateralized debt obligations. He demonstrates how pooling loans and securitizing them into new bonds in multiple tranches can produce both higher and lower quality investments. He explains that it was securitization that allowed subprime loans to find their way into low risk funds like pensions and money markets. He also demonstrates how the method fails when the underlying loans are highly correlated.
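
The pooling argument can be illustrated with a two-loan calculation. The numbers here are assumptions chosen for illustration and are not necessarily the ones Lo uses: each loan defaults with probability p = 0.1, the senior tranche takes a loss only if both loans default, and the junior tranche takes a loss if at least one defaults.

```latex
\begin{align*}
\text{Independent loans:}\quad
  P(\text{senior loss}) &= p^2 = 0.1^2 = 0.01 \\
  P(\text{junior loss}) &= 1 - (1 - p)^2 = 1 - 0.81 = 0.19 \\[4pt]
\text{Perfectly correlated loans:}\quad
  P(\text{senior loss}) &= P(\text{junior loss}) = p = 0.1
\end{align*}
```

Under independence the senior tranche is ten times safer than either loan (1% vs. 10%) while the junior tranche is riskier (19%). Under perfect correlation the loans default together, so the senior tranche is no safer than the loans it was built from.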

Lo then discusses the crisis preconditions that Charles Perrow puts forth in his book "Normal Accidents",

  • Complexity
  • Tight Coupling

To this Lo adds an additional precondition,

  • The lack of (frequent) negative feedback

Lo asserts that under these conditions, human behavior naturally leads to crises like the one in 2008. His points greatly reminded me of those made by Richard Bookstaber in his book "A Demon of Our Own Design". I found Lo's argument compelling. Without frequent negative feedback, conditions build to a point where proactively unwinding them becomes impossible - the cost is too great to actively decide to incur. The cost inevitably comes but without conscious action.

In closing, a great talk worth watching.

Saturday, January 15, 2011

More Keyboard Madness

Last March, I posted about a growing keyboard obsession and decision to buy a Filco Majestouch mechanical keyboard. Well, the madness continues. I am now the proud owner of a Kinesis Contoured Advantage.

The Filco is a high quality keyboard. The Cherry Blue switches feel great but are indeed noisy - not ideal for late night hacking at home. Actually, even daytime hacking was frowned upon by my family. Luckily, I got a closed-door office at work which allowed me to use the Filco all day without driving others crazy.

Unfortunately, I started experiencing wrist pain and the evidence pointed to using the Filco. The ergonomic people at work set me up with a keyboard tray to promote better hand positioning. That helped but I felt "trapped" in the "correct" position. Not only did I still have wrist pain, I was even more uncomfortable.

Frustrated, I switched back to using a Microsoft Natural keyboard. The pain subsided but the rubber dome keys felt awful after typing on the Cherry switches. The extra force required to press the keys was noticeable and tiring.

I considered pre-ordering a Truly Ergonomic keyboard but I didn't want to be an early adopter. I'd rather wait until there are many reviews.

I mentioned in the March post that I've long wanted a Kinesis Contoured Advantage keyboard. They look really cool but, more importantly, have many positive reviews online. I decided that it was finally time to try one.

I've been using the Kinesis for a couple of weeks and really like it. The Cherry brown switches feel as good as the Filco's blue switches but without the noise. I really like the thumb keys and use my pinky fingers much less. I remapped the CTRL and ALT keys to the same positions on both thumb pads. Using Emacs and org-mode is now easier and faster. Most importantly, the wrist pain is fading.

It didn't take as long to adjust to the Kinesis layout as I feared. After a couple of hours, I was typing at my usual rate. Switching back and forth between the Kinesis and regular keyboards does take some adjustment but it appears to be getting easier.

The Kinesis may well be my ideal keyboard. Now I'm wondering when I should consider switching to a Dvorak layout. A challenge for another day.

Sunday, January 2, 2011

Where's Johnny?

I've been neglecting this blog lately. Instead of posting, I've been spending my rare spare time exploring multiple interests. Specifically, Clojure, data science, and quantitative investing.

I'm hoping to combine these interests and post about them in the near future.