The “ready” pattern in Javascript, distilled

In my daily Javascript travails, I frequently need to initialize an object asynchronously. But if it’s asynchronous, I can’t call some of its methods right away! Backbone and jQuery help with this problem by letting you assign events and custom triggers.

With Backbone, you could do something like myObject.on("ready", function(){ … }), then call myObject.trigger("ready") somewhere else. But that doesn’t cover my use case. If I call myObject.on("ready", …) after the ready trigger has already fired, then nothing will happen. I’d prefer that the callback fire right away!

I made my life a lot easier by making a mixin to handle this. You can check it out on github. I wrote it in coffeescript, but included the javascript in case you don’t want to compile.

Here’s a mock of how I might use it with Backbone:

Video = Backbone.Model.extend({
  ready: readyMixin,

  // this initializer isn't ready until later
  initialize: function() {
    somethingAsync(function() {
      this.videoElt = document.getElementById("my_html5_video");
      this.ready(true);
    }, this);
  },

  // internally use ready() so that play() can be called before 
  // the video is actually ready.
  play: function() {
    this.ready(function() { this.videoElt.play() });
    return this;
  }
});

myVideo = new Video().play();

It handles nested ready() calls just fine, and also lets you pass in arguments after the callback. You can sugar up your syntax too by passing the function name as a string.

// This works just fine.
myVideo.ready(function() {
  this.ready(function() {
    this.play();
  });
});

// Passing arguments.
myVideo.ready(function(pauseOptions) { this.pause(pauseOptions) }, options.forPause);

// Sweet sugary goodness, doing the same thing as above
myVideo.ready("pause", options.forPause);

// I could've also done this:
myVideo.ready(myVideo.pause, options.forPause);
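
If you’re curious what’s going on under the hood, here’s a minimal sketch of how a mixin like this could work. It is not the code from the repo: readySketch and the _isReady / _readyQueue properties are names I’m making up for illustration. The idea is just to queue callbacks until ready(true) is called, and to run them immediately after that.

```javascript
// Hypothetical sketch of a "ready" mixin; not the published implementation.
function readySketch(arg) {
  var self = this;
  // Anything after the first argument gets passed along to the callback.
  var extra = Array.prototype.slice.call(arguments, 1);
  if (!self._readyQueue) self._readyQueue = [];

  // ready(true): mark the object ready and flush queued callbacks in order.
  if (arg === true) {
    self._isReady = true;
    var queue = self._readyQueue;
    self._readyQueue = [];
    for (var i = 0; i < queue.length; i++) queue[i]();
    return self;
  }

  // ready("methodName", ...) is sugar for ready(this.methodName, ...).
  var fn = typeof arg === "string" ? self[arg] : arg;
  var run = function () { fn.apply(self, extra); };

  // Already ready? Fire right away. Otherwise wait in line.
  if (self._isReady) run();
  else self._readyQueue.push(run);
  return self;
}
```

Because the flag is set before the queue is flushed, a nested ready() call inside a callback runs right away instead of getting stuck in the queue.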

Blazing Fast Local DNS in Mac OS X

Ever notice how mind-numbingly slow it is to look up your local server on Mac OS X Lion? This thing is running locally with no network latency–it should be blindingly fast!

This stackoverflow post has some good information:

http://stackoverflow.com/questions/6841421/mac-osx-lion-dns-lookup-order

And this post describes exactly what’s happening with .local:

http://www.thursby.com/local-domain-login-10.7.html

Apparently, from 10.6 to 10.7 they changed the host lookup rules. If you’re looking up a normal TLD (e.g. .com, .org, .net, or even .local), then it first checks the explicit nameserver, which is usually your modem, which forwards the request to a real nameserver of your ISP’s choosing. If it doesn’t find a match, then it falls back to your /etc/hosts definitions. Moreover, it doesn’t fail just once, but twice! If you don’t have ::1 as your search domain, it will issue an additional IPv6 request when the IPv4 one fails. I’m guessing this is a security measure, so they can detect phishing attempts even when /etc/hosts is compromised.

All that is to say, if you’re using a normal TLD or .local, then you’re making 3 DNS requests instead of 1, and 2 of them are not local.

So how can you make sure your local DNS entries are near instantaneous? There’s a simple answer in that stackoverflow thread: Simply use a non-standard TLD. It will go straight to lookup in /etc/hosts.

But if you think using non-standard TLDs is evil, you can set up a local instance of dnsmasq. I followed these directions to get it set up:

http://blog.philippklaus.de/2012/02/install-dnsmasq-locally-on-mac-os-x-via-homebrew/

That guide is slightly out-of-date: the package changed from uk.org.thekelleys.dnsmasq to homebrew.mxcl.dnsmasq, which affects the launchctl lines. But you’ll get the proper command from homebrew after it installs. Then you can either manually launch dnsmasq or restart your computer.

In /usr/local/etc/dnsmasq.conf, I use the following:

local=/max/
#address=/double-click.net/127.0.0.1

That intercepts all *.max requests and points them directly to /etc/hosts. If I had any hostnames that absolutely must end in .com or .org, I would use the address syntax above.
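
To check whether a change like this actually helps, you can time a lookup yourself. Here’s a rough Node.js sketch. It assumes you have Node installed, and dev.max is a made-up hostname, so swap in one from your own /etc/hosts. Node’s dns.lookup goes through the operating system’s resolver, so it should see roughly the same delays as your browser.

```javascript
var dns = require("dns");

// Time a single hostname lookup through the OS resolver.
// Calls back with (err, { hostname, address, ms }).
function timeLookup(hostname, callback) {
  var start = process.hrtime();
  dns.lookup(hostname, function (err, address) {
    var diff = process.hrtime(start); // [seconds, nanoseconds]
    var ms = diff[0] * 1000 + diff[1] / 1e6;
    callback(err, { hostname: hostname, address: address, ms: ms });
  });
}

// Example usage ("dev.max" is a hypothetical entry in /etc/hosts):
// timeLookup("dev.max", function (err, result) {
//   if (err) return console.error("lookup failed:", err.code);
//   console.log(result.hostname + " -> " + result.address +
//               " in " + result.ms.toFixed(1) + " ms");
// });
```

Run it a few times before and after the change; the first lookup may be slower than later ones because of caching.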

Getting the data to disambiguate word senses – Part 1

When interpreting a phrase, we very often need to decide on a word sense. For example, if I say “he plays the bass”, then “bass” is probably referring to music. If I say “he caught a bass”, then we’re probably fishing. And if I say “he was slapped with a good bass”–well there’s not really much context for that and I’m not sure what people will think. We usually decide subconsciously, remarkably fast. Only in the last case do we stop for a minute and go, “wait, what?”

The current state of NLP is not terrible at making decisions. Using data on co-occurrences, or how frequently two words occur near each other, we can be intelligent. For example, “play” is near “bass”, so that’s most likely talking about a bass guitar. It’s a pretty simple concept with some complicated math behind it.
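
To make the idea concrete, here’s a toy sketch. The counts below are invented purely for illustration (they don’t come from any corpus), and guessSense is a hypothetical helper. Real systems use proper statistics rather than a raw sum, but the shape is the same: add up how often each context word shows up near each sense, and pick the winner.

```javascript
// Toy co-occurrence counts, invented purely for illustration.
var cooccurrence = {
  "bass (music)": { plays: 40, guitar: 55, band: 30 },
  "bass (fish)": { caught: 35, lake: 25, fishing: 50 }
};

// Pick the sense whose known neighbors best match the context words.
// Returns null when no context word has been seen near any sense.
function guessSense(contextWords, table) {
  var best = null;
  var bestScore = 0;
  for (var sense in table) {
    if (!Object.prototype.hasOwnProperty.call(table, sense)) continue;
    var score = 0;
    for (var i = 0; i < contextWords.length; i++) {
      var word = contextWords[i];
      if (Object.prototype.hasOwnProperty.call(table[sense], word)) {
        score += table[sense][word];
      }
    }
    if (score > bestScore) {
      best = sense;
      bestScore = score;
    }
  }
  return best;
}
```

With this table, guessSense(["he", "plays", "the"], cooccurrence) picks the music sense, ["he", "caught", "a"] picks the fish, and the “slapped” sentence comes back null because none of its words appear in the table.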

The tricky part here is associating a co-occurrence with a word sense. There have been some great innovations in this area so that we can use data we already have.

The trouble with all of those is that they rely on “dead” data. Corpus statistics are for a document (albeit a very large document) that isn’t changing. Words are tagged by hand and the statistics are generated infrequently. A thesaurus is similar, with its associations curated by just a few people. Words change and die. World (or local) events affect human thought and language. A few people can slowly document the associations in a dead language, but they will always be hopelessly behind on culture.

That’s why AI and NLP researchers are eagerly awaiting The Semantic Web. The idea is basically to turn the entire internet into a knowledge graph by Manually. Tagging. Everything. It’s an interesting concept… but people won’t do it. Okay, maybe some people will do it–the ones who care about SEO or academics. But your average blogger or YouTube commenter doesn’t care. Thus the data will be skewed and incomplete.

In an ideal world, we’d have highly structured (and therefore searchable) data, automatically updated with the entirety of the internet. If people won’t do it for us, how can we coax them into it?

I’ve got some ideas already, but I’ll be thinking about it the next couple of days. I’ll get back to you on possible solutions.

Staying healthy at a startup

If you work at a startup, your primary goal is (or should be) “get shit done.” Anything that slows you down is the enemy. For a programmer, this often means forgetting to eat because you’re so focused. Or you might stay up til 4 A.M. because you’ve got pizza cramps, and you’re wired on Red Bull, and this webapp isn’t going to code itself. If you get sick, you should just power straight through it. It doesn’t affect your job, right?

Lack of food and sleep is one thing, but that’s not the focus of this post. Preventing chronic sickness is.

I think a lot of driven people have this problem. A cold makes you feel shitty, but it doesn’t totally stop you from working. So it can be tempting to pop some vitamins and dayquil, blow your nose, and head into the office.

Don’t do it. I’m serious. This post isn’t meant to be accusatory. It’s a public service announcement. It’s a note to myself as much as anyone. Do not go into work sick.

This is particularly relevant because I’ve had a cold or virus, oh, 3 separate times in the last 3 weeks. And I went in a few times when I probably shouldn’t have.

Let’s do a comparison, shall we? Here’s what will happen if you go to the office sick:

  • Your performance will be subpar. Have fun picking up the pieces later.
  • You will prolong your sickness, and possibly make it worse.
  • You will infect one or more co-workers. You inconsiderate jag.
  • You will make your team annoyed because you’re an inconsiderate jag.

And here’s what will happen if you stay home:

  • You will miss a day or two of work. Deal with it. Check your email every couple hours if you’re so addicted.
  • You will start feeling better soon.
  • You will prevent multiple co-workers from getting sick. Their time is important too.
  • Your team will be happy with you because you’re a considerate human being.

Here are some other notes I have that might come in handy. Some might say they’re common sense, but we all cut corners sometimes.

  1. If you feel like you might be getting sick, head home immediately. Take no chances.
  2. Cough and sneeze into the crook of your elbow, even if you think no one’s around.
  3. Don’t re-use public utensils or dishes. That shit needs to be washed and sanitized.
  4. Speaking of washing, make sure you have fairly new sponges. They get real gross real fast.
  5. Sponges should be sanitized before use. Microwave them!
  6. And speaking of gross, you better not be sharing a towel in the bathroom. Use disposable paper towels or air jets, my friends. This guy will show you how to use paper towels and conserve at the same time!
  7. If you think something might be contaminated and it’s disposable, throw it out. If it’s not disposable, sanitize it immediately.
  8. Keep sanitizer (I like mine without triclosan!) in a few different places and make sure people know about it.

In the fight for getting shit done, it’s important to stay healthy. Even though you might be focused on a project, don’t cut corners. Cover your mouth, wash your hands, and be considerate. Working through an illness is snake oil. When you’re sick at the office, every step forward is two steps back.

Be smart. Stay healthy. Get shit done.

Internet Explorer not ready for classid attribute

The other day I was constructing an SWF embed on the fly with DOM elements (as I often do), and I ran into this weird bug in all versions of IE. Who’s surprised? Any code on the page that required jQuery’s ready function to fire was hosed. The loading icon kept rotating as if the page was waiting for something, but there was nothing going on in the network tab.

I narrowed it down to these lines:

var objElem = document.createElement("object");
objElem.setAttribute("classid", "clsid:D27CDB6E-AE6D-11cf-96B8-444553540000");

If you execute that javascript before the DOM is ready, the page will hang in all versions of IE. Ready/Load events are not fired at all.

Here’s a super simple test case I put together:

http://dl.dropbox.com/u/124192/websites/iebug1/index.html

What the hell.

The only solution I know of is to make sure you don’t run that code before the document is ready. Yikes.
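
Here’s a rough sketch of that workaround without leaning on jQuery. afterLoad is a made-up helper name, and I’m being conservative by waiting for the full load event rather than DOM ready. It takes the window as a parameter so it can be exercised outside a browser.

```javascript
// Run fn once the page has fully loaded (or immediately if it already has).
function afterLoad(win, fn) {
  if (win.document.readyState === "complete") {
    fn();
    return;
  }
  if (win.addEventListener) {
    win.addEventListener("load", fn, false);
  } else {
    // Old IE has attachEvent instead of addEventListener.
    win.attachEvent("onload", fn);
  }
}

// In a browser: only touch classid after the page has loaded.
if (typeof window !== "undefined") {
  afterLoad(window, function () {
    var objElem = document.createElement("object");
    objElem.setAttribute("classid", "clsid:D27CDB6E-AE6D-11cf-96B8-444553540000");
  });
}
```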

Soon after posting this, I found some other sites that explain the issue in greater detail. Here’s a better look from pipwerks. What doesn’t seem to be covered there (or anywhere else for that matter) though, is that the page will not hang as long as you run that code after the document is ready. Then again, it will probably still fail when you append it to the DOM. But I was just using it to get a native HTML string.

CSS Injection with Javascript!

inject-css

I’m happy to announce the release of inject-css! It tries to solve the specificity problems encountered by embedding your CSS on a 3rd party website. The plugin has been half-complete for a few months. The latest version works without jQuery but also provides a simple jQuery interface.

I first came up with CSS injection to harden the styles of the Wistia Socialbar. The socialbar was always embedded into someone else’s site, which meant it frequently experienced CSS collisions and specificity issues. My task was to find a way for other sites not to mess up our styles, and vice versa.

How about iframes? Not this time.

The most resilient solution is probably just to put it in an iframe. But the social buttons behave much better if they’re on the correct page, and let’s face it: an iframe isn’t always an option. Sometimes you want your javascript to have access to the page.

Inline everything! Just kidding.

Forgetting the iframe, my next thought was, “let’s just set all the CSS as inline styles.” Then my face screwed up and I coughed a lot. Besides being ugly as sin, inline styles create other problems. Any CSS pseudo-classes would need to be reworked to be dependent on javascript. We wouldn’t be able to use certain nice features of cascading style sheets such as, for example, cascading styles. Plus it’s just straight up ugly. No, there’s gotta be something else.

!important everything! Bleh

Cliche and ugly. I would hate editing that. And what if the other styles are already !important?

An external stylesheet? Close, but imperfect.

Ideally we would use an external stylesheet that only applies to the socialbar. To do that, we need a very specific selector on the page, i.e. a unique ID. But if we have a fixed ID, then there can only be one socialbar per page. Even then, we might not avoid the CSS specificity issues. What if the site’s CSS has a double ID selector? Simple: we’re screwed. That kind of thing isn’t as rare as you might hope.

CSS Injection! YES!!!

The idea is to define a stylesheet, then have javascript change all the CSS selectors to be as specific as possible, thereby guaranteeing that our selectors win. It can also dynamically add !important to every property if it seems necessary. Our specificity issues are solved! But this solution is beneficial in other ways:

  • Injected CSS is basically a stylesheet that only applies to part of your page. Cool, right? The same widget can be copied multiple times on the same page without conflicts.
  • Since injecting doesn’t set inline CSS properties, you avoid the “can’t set back to default” problem.
  • You can write your CSS without worrying about specificity.
  • You don’t pollute the DOM with a million inline styles.
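
The selector-rewriting idea can be sketched in a few lines. To be clear, this is not inject-css’s actual code or API: boostSpecificity is a hypothetical function, and the regex is naive (it ignores @media blocks, comments, and other edge cases). It relies on the fact that repeating an ID in a selector, like #w1#w1, is valid CSS and raises specificity each time.

```javascript
// Prefix every selector in cssText with a repeated ID, e.g. "#w1#w1#w1 .btn".
// Naive sketch: handles flat "selector { body }" rules only.
function boostSpecificity(cssText, scopeId, repeat) {
  var prefix = "";
  for (var i = 0; i < (repeat || 3); i++) prefix += "#" + scopeId;

  return cssText.replace(/([^{}]+)\{([^{}]*)\}/g, function (match, selectors, body) {
    var scoped = selectors.split(",").map(function (sel) {
      return prefix + " " + sel.trim();
    }).join(", ");
    return scoped + " {" + body + "}";
  });
}
```

For example, boostSpecificity(".btn, a:hover { color: red; }", "w1", 2) gives back "#w1#w1 .btn, #w1#w1 a:hover { color: red; }". Give each widget instance its own ID and the same stylesheet can be injected once per instance.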

So if you’re a widget developer who cares about your styles breaking, inject some CSS to ease your pain! It’s helped me quite a bit already.

NTFS on Mac OS X

First, you may ask, why NTFS?

HFS is not well-supported in Windows or Linux. EXT is better than HFS, but has similar problems. FAT32 is the best supported across Windows, Mac, and Linux, but it can’t handle large files. And I have some large files. That leaves NTFS. NTFS works great in Windows (duh) and Linux (hooray!), but not in Mac (boooo).

Okay, so maybe there’s something available already

There are a few pay solutions: Tuxera and Paragon are the main players from what I can see. Of the free options, MacFUSE combined with ntfs-3g (the ntfs-3g guys also make Tuxera) seems to be the consensus internet choice. But it’s getting a little long in the tooth: the last update for MacFUSE is from Dec 2008.

The problem is, I’m cheap, and I don’t have a particularly demanding use case for NTFS. It’s hard for me to justify $35 or even $20. I back up files to NTFS. I’m not doing heavy IO and I’m not in a time crunch. I just want simple functionality. After doing some traditional unix config to enable native NTFS support in Mac OS X (read, but don’t try), and thereby corrupting my 1TB backup drive, I decided my NTFS priorities, in reverse order, were:

  1. Fast read support
  2. Stable write support
  3. Don’t corrupt my hard drive
  4. Seriously, don’t corrupt my hard drive

Let’s make those priorities a reality

I tried MacFUSE and ntfs-3g, but there were some stability issues, similar to the native NTFS write support. For me, this meant more HD corruption.

After some sleuth work to solve my issues, I found a discussion of MacFUSE forks in development. Sweet, I thought, someone is solving my problems for me! That link is a little out of date, but it put OS X Fuse on my radar. To me, that’s the winner of the group, and it’s still being actively developed! NICE!

OS X Fuse solves some stability issues, and makes NTFS support work well enough for me. Catacombae’s free ntfs-3g drivers are apparently “good enough” when used with OS X Fuse.

Here’s how to set it up

  1. If you’ve installed any MacFUSE or ntfs-3g drivers already, uninstall them via the System Preferences panel.
  2. Download OS X Fuse and install it.
  3. Download Catacombae’s free ntfs-3g driver and install it.
  4. Plug in an NTFS volume and enjoy.

During the ntfs-3g install, it will ask if you want NTFS to default as safe and slow or unsafe and fast. It explains the technical difference right there. I chose safe and slow because that’s my priority usually. I may turn on unsafe and fast for other volumes individually–that’s an option in the System Preferences.

Conclusion

If your use case is similar to mine, save some money and use OS X Fuse + ntfs-3g. If you need a bit more security and speed, go with Paragon or Tuxera. And if you’re hardcore, do me a favor and update the EXT drivers for Windows and Mac so I don’t need to deal with this ancient filesystem anymore. Defragging is so 2001.

Disclaimer

My advice here is based purely on anecdotal evidence; I can’t vouch that it will work on all systems. Write to NTFS at your own risk!

On a Whim

The Back Story

When I first started at Wistia, my vim setup was bare. Honestly, I was struggling enough with my transition from Notepad++ to bother with customization. But when I finally got comfortable, it was time to hunt for plugins.

My vim environment grew, as did Jim’s and Brendan’s. We installed a ton of plugins, 100% haphazardly, to see what would stick. Seriously, if my ~/.vim folder was a house, it would have been condemned. Regardless, at this point, we were converging on a company-wide standard. We knew the plugins we liked and defaults that made sense. Only a few minor settings separated us.

We wanted a centralized location to share our setup, and we needed it available fast–a Wistia U class on vim (that we, er, Brendan, had to teach!) was coming up! So what to do? At first, we were ready to throw our whole setup into Dropbox. But why use Dropbox when we have Github?

And so, on a whim (and after a few hours messing with vim runtimes), our company-wide vim setup was pushed to Github. Our non-techie brethren approved. Apprehensively, mind you, since they had no idea what any of this meant.

For them, here’s why whim is cool

  1. You get only the best hand-picked plugins.
  2. You get settings that are sane and improve your workflow.
  3. You can easily move these settings from computer to computer.
  4. You can customize your settings and plugins without changing the core of whim.
  5. You benefit from all the improvements others make along the way.
  6. Your ~/.vim folder will never be condemned.

For vim nerds, this is actually the second incarnation of whim

The first definitely would have been condemned, and there were issues keeping plugins updated. It was fixed with the rock-solid vim-pathogen, and the annoying but powerful git submodule feature. Here’s the basic structure of whim now:

  • vimrc (The core whim vimrc.)
  • gvimrc (The core whim gvimrc.)
  • bundle/ (Each plugin is in a separate folder in here.)
  • local/ (Each person can store their customizations here.)

Most of the folders in bundle/ are git submodules pointing to github repositories. The local/ folder contains max.vimrc, brendan.vimrc, jim.vimrc, etc. This is so we can easily try out other setups, or switch to our own on a different computer. Siiiick.

The plugins included in Whim

You’ll notice a heavy emphasis on plugins by tpope, and with good reason. He’s one of the best vim plugin devs out there, and managed to help me on my own integration with vim-fugitive. Anyway, here’s the list so far.

Check out whim on Github: https://github.com/wistia/whim

The Google +1 Button, Dynamic HTML, and You

With the rise of Google Plus, I thought it would be a good idea to experiment with the Google +1 button. At first, it really got on my nerves. Strange and seemingly arbitrary decisions abound. But in writing this article, I’ve come to appreciate their point of view. I’ll try to surmise their embed strategy along with other options they might have had.

First, you should check out Google’s +1 button generator. Here’s the default, with GB chosen as the language:

<!-- Place this tag where you want the +1 button to render -->
<g:plusone annotation="inline"></g:plusone>

<!-- Place this render call where appropriate -->
<script type="text/javascript">
  window.___gcfg = {lang: 'en-GB'};

  (function() {
    var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true;
    po.src = 'https://apis.google.com/js/plusone.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s);
  })();
</script>

You’ll notice a few things about the markup.

  1. It uses non-standard HTML tags and attributes by default. You need to manually check “HTML5 Valid Syntax” in the Advanced tab to get valid markup.
  2. It uses inline JS to asynchronously load external JS by default.
  3. If you specify a language and uncheck “Asynchronous”, JSON parameters are added INSIDE the external javascript, i.e. <script src="…">{lang:'en-GB'}</script>

Is this code arbitrary or carefully selected? Hard to tell at first glance. Let’s reason it out.

  1. My best guess is that document.getElementsByTagName() is a superfast DOM traversal, and having a unique “g:plusone” makes sure they don’t lag on old browsers. For valid HTML (using a class), a DOM traversal is required. Ouch.
  2. With no customization, <script async> is probably a better solution. But it looks like Google chose configurability over simplicity. By using an inline script, they can set optional parameters before the external script is loaded, without hitting the app server. Why not just use two consecutive <script> tags? I’ll go into that in a later post. But in short: there are issues in dynamic environments.
  3. It would be awesome if we could pass parameters this way! But unless Google has discovered a secret syntax supported by all major vendors, it’s no savior. Still… the script is clearly using those params. How is it getting them? My assumption: the script knows its own URL; find the script tag on the page with the same URL! Then pass the innerHTML as JSON. Boom! Now mark the script tag as processed. Future script tags can ignore the processed params and use the next unprocessed set.

Google +1: What is your secret plan?
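
Here’s a sketch of how my guess in point 3 might look in code. This is speculation about the technique, not Google’s actual script: readInlineParams and the data-params-read attribute are names I invented. Note that {lang:'en-GB'} isn’t strict JSON, so the sketch evaluates it as an object literal, which is only reasonable because the page author wrote that content.

```javascript
// Find the next unprocessed <script> tag with the given src, mark it as
// processed, and return its inline content as an object.
// Returns null when no unprocessed matching tag is left.
function readInlineParams(doc, scriptUrl) {
  var scripts = doc.getElementsByTagName("script");
  for (var i = 0; i < scripts.length; i++) {
    var s = scripts[i];
    if (s.src !== scriptUrl || s.getAttribute("data-params-read")) continue;

    // Mark it so the next call moves on to the next tag.
    s.setAttribute("data-params-read", "true");

    var text = s.innerHTML.replace(/^\s+|\s+$/g, "");
    if (!text) return {};
    // Loose object literal, not strict JSON, so evaluate instead of JSON.parse.
    return new Function("return (" + text + ");")();
  }
  return null;
}
```

Each call consumes one tag, so two +1 embeds with different params on the same page would each get their own settings.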

There’s obviously some merit to their approach. But there are hidden traps…

  1. Using <g:plusone> is brittle. It works in browsers, but other tools can choke on it. For example: try dynamically adding that with jQuery in IE8. Doesn’t work, huh? It works without jQuery. XHTML validators, which some WordPress instances use, may fail too.
  2. It’s long and ugly! Also, the browser behavior when you append an inline <script> is undefined. jQuery will eval it, but setting innerHTML directly won’t. Other libraries might not either. So this isn’t dependable in a dynamic environment!
  3. This processing method for external scripts is a cool idea. It requires that the script URL be extremely predictable forever, but hey, Google’s in a pretty stable place. The one issue is with libraries like jQuery. To keep the DOM “clean”, they remove external <script> tags after they’ve been executed! That means the parameters disappear. Damn. So close.

Google seems focused on doing things correctly for browsers in static environments by default. That’s the most common case, I guess. All the options in the +1 button generator exist for edge cases, which in reality appear often enough to be noticeable. By not using the most resilient setup by default, I predict ongoing embedding issues for the Google +1 button.

What’s their next move?