modules | Read the Tea Leaves

Posts Tagged ‘modules’

19 Oct

The struggles of publishing a JavaScript library

Posted by Nolan Lawson in Webapps. Tagged: javascript, modules, npm. 19 comments

If you’ve done any web development in the past few years, then you’ve probably typed something like this:

$ bower install jquery

Or maybe even:

$ npm install --save lodash

For anyone who remembers the dark days of combing Github for jQuery plugins, this is a miracle. But as with all software, somebody had to write that code in order for you to be able to download it. And in the case of tools like Bower and npm, somebody also had to do the legwork to publish it. This is one of their stories.

The Babelification of JavaScript

I tweeted this recently:

https://twitter.com/nolanlawson/status/653610332989059072

I got some positive feedback, but I also saw some incredulous responses from people telling me I only need to support npm and CommonJS, or more snarkily, that supporting “just JavaScript” is good enough. As a fairly active open-source JavaScript author, though, I’d like to share my thoughts on why it’s not so simple.

The JavaScript module ecosystem is a mess these days. For module definitions, we have AMD, UMD, CommonJS, globals, and ES6 modules ¹. For distribution, we have npm, Bower, and jspm, as well as CDNs like cdnjs, jsDelivr, and Github itself. For translating between Node and browser code, we have Browserify, Webpack, and Rollup.

Supporting each of these categories comes with its own headaches, but before I delve into that, here’s my take on how we got into this morass in the first place.

What is a JS module?

For the longest time, JavaScript didn’t have any commonly-accepted module system, so the most straightforward way to distribute your code was as a global variable. jQuery plugins also worked this way – they would just look for the global window.$ or window.jQuery and hook themselves onto that.

But thanks largely to Node and the influx of people who care about highfalutin computer-sciencey stuff like “not polluting the global namespace,” we now have a lot more ways of modularizing our code. npm is famous for using CommonJS, with its module.exports and require(), whereas other tools like RequireJS use an alternative format called AMD, known for its define() and asynchronous loading. (It’s never ceased to confuse me that RequireJS is the one that doesn’t use require().) There’s also UMD, which seeks to harmonize all of them (the “U” stands for “universal”).

In practice, though, there’s no good “universal” way to distribute your code. Many libraries try to dynamically determine at runtime what kind of environment they’re in (here’s a pretty gnarly example), but this makes modularizing your own code a headache, because you have to repeat that boilerplate anywhere you want to split up your code into separate files.

More recently, I’ve seen a lot of modules migrate to just using CommonJS everywhere, and then bundling it up for distribution with Browserify. This can be fraught with its own difficulties though, if you aren’t aware of the subtleties of how your code gets consumed. For instance, if you use Browserify’s --standalone flag (-s), then your code will get built as an AMD-ready, UMD-ready, and globals-ready bundle file, but you might not think to add it as a build step, because the stated use of the --standalone flag is to create a global variable ².

However, my new personal policy is to use this flag everywhere, even when I can’t think of a good global variable name, because that way I don’t get issues filed on me asking for AMD support or UMD support. (Speaking of which, it still tickles me that someone had to actually open an issue asking me to support a supposedly “universal” module system. Not so universal after all, is it!)

Package managers and pseudo-package managers

So let’s say you go the CommonJS + Browserify route: now you have an interesting problem, which is that you have both a “source” version and a “distributed” version of your code. (Commonly these are organized into a src/lib folder and a dist folder, but those are just conventions.) How do you make sure your users get the right one?

npm is a package manager that expects CommonJS modules, so typically in your package.json, you set the "main" key to point to whatever your source "src/index.js" file is. Bower, however, expects a bundle file that can be directly included as a <script> tag, so in that case you’ll want to set the "main" inside the bower.json to point instead to your "dist/mypackage.js" or "dist/mypackage.min.js" file. jspm complicates things further by defaulting to npm’s package.json file while actually expecting non-CommonJS modules, but you can override that behavior by including a {"jspm": "main": "dist/mypackage.js"}} in your package.json. Whew! We’re all done, right?

Not so fast. As it turns out, Bower isn’t really a package manager so much as a CLI over Github. What that means is that you actually need to check your bundle files into Git, to ensure that those dist/ files are available to Bower users. At the same time, you’ll have to be very cognizant not to check in anything you don’t want people to download, because Bower’s "ignore" list doesn’t actually avoid downloading anything; it just deletes the ignored files after they’re downloaded, which can lead to some enormous Bower downloads. Couple this with the fact that you’re probably also juggling .gitignore files and .npmignore files, and you can end up with some fairly complicated release scripts!

Of course, many users will also just download your bundle file from Github. So it’s important to be consistent with your Git tags, so that you can have a nice tidy Github releases page. As it turns out, Bower will also depend on those Git tags to determine what a “release” is – actually, it flat-out ignores the "version" field in bower.json. To make sense of all this complexity, our policy with PouchDB is to just do an explicit commit with the version tag that isn’t even a part of the project’s main master branch, purely as a “release commit” for Bower and Github.

What about CDNs?

Github discourages using their hosted JavaScript files directly from <script> tags (in fact their HTTP headers make it impossible), so often users will ask if they can consume your library via a CDN. CDNs are also great for code snippets, because you can just include a <script> tag pointing to the latest CDN release. So lots of libraries (including PouchDB) also support jsDelivr and cdnjs.

You can add your library manually, but in my experience this is a pain, because it usually involves checking out the entire source for the CDN (which can be many gigabytes) and then opening a pull request with your library’s code. So it’s better to follow their automated instructions so that they can automatically update whenever your code updates. Note that both jsDelivr and cdnjs rely on Git tags, so the above comments about Github/Bower also apply.

Correction: Both jsDelivr and cdnjs can be configured to point to npm instead of Github; my mistake! The same applies to jspm.

Browser vs Node

For anyone who’s written a popular JavaScript library, the situation inevitably arises that someone tries to use your Node-optimized library in the browser, or your browser-optimized library in Node, and invariably they run into issues.

The first trick you might employ, if you’re working with Browserify, is to add if/else switches anytime you want to do something differently in Node or the browser:

function md5(str) {
  if (process.browser) {
    return require('spark-md5').hash(str);
  } else {
    return require('crypto').createHash('md5').update(str).digest('hex');
  }
}

This is convenient at first, but it causes some unexpected problems down the line.

First off, you end up sending unnecessary Node code to the browser. And especially if the Browserified version of your dependencies is very large, this can add up to a lot of bytes. In the example above, Browserifying the entire crypto library comes out to 93KB (after uglify+gzip!), whereas spark-md5 is only 2.6KB.

The second issue is that, if you are using a tool like Istanbul to measure your code coverage, then properly measuring your coverage in Node can lead to a lot of /* istanbul ignore next */ comments all over the place, so that you can avoid getting penalized for browser code that never runs.

My personal method to avoid this conundrum is to prefer the "browser" field in package.json to tell Browserify/Webpack which modules to swap out when building. This can get pretty complicated (here’s an example from PouchDB), but I prefer to complicate my configuration code rather than my JavaScript code. Another option is to use Calvin Metcalf’s inline-process-browser, which can automatically strip out process.browser switches ³.

You’ll also want to be careful when using Browserify transforms in your code; any transforms need to be a regular dependency rather than a devDependency, or else they can cause problems for library users.

Wait, you tried to run my code where?

After you’ve solved Node/browser switching in your library, the next hurdle you’ll likely encounter is that there is some unexpected bug in an exotic environment, often due to globals.

One way this might manifest itself is that you expect a global window variable to exist in the browser – but oh no, it’s not there in a web worker! So you check for the web worker’s self as well. Aha, but NW.js has both a Node-style global and browser-style window as global variables, so you can’t know in advance which other globals (such as Promise or console) are attached to which! Then you can get into even stranger environments like iOS’s JSCore (which is used by React Native), or Electron, or Qt WebKit, or Rhino/Nashorn, or Java FXWebView, or Adobe Air…

If you want to see what kind of a mess this can create, check out these lines of code from Lodash, and weep for poor John-David Dalton!

My own solution to this issue is to never ever check for window or global or anything like that if I can avoid it, and instead use typeof whatever === 'undefined' to check. For instance, here’s my typical Promise shim:

function PromiseShim() {
  if (typeof Promise !== 'undefined') {
    return Promise;
  }
  return require('lie');
}

Trying to access a global variable that doesn’t exist is a runtime error in most JavaScript environments, but using the typeof check will prevent the error.

Browserify vs Webpack

100% of my experience with webpack is users opening issues on my libraries because it does something subtly different then @browserify

— Professional Calvin Architect (@CWMma) October 12, 2015

Most library authors I know tend to prefer Browserify for building JavaScript modules, but especially with the rise of React and Flux, Webpack is increasingly becoming a popular option.

Webpack is mostly consistent with Browserify, but there are points of divergence that can lead to unexpected errors when people try to require() your library from Webpack. The best way to test is to simply run webpack on your source CommonJS file and see if you get any errors.

In the worst case, if you have a dependency that doesn’t build with Webpack, you can always tell users to specify a custom loader to work around the issue. Webpack tends to give more control to the end-user than Browserify does, so the best strategy is to just let them build up your library and dependencies however they need to.

Enter ES6

This whole situation I’ve described above is bad enough, but once you add ES6 to the mix, it gets even more complicated. ES6 modules are the “future-proof” way of authoring JavaScript, but as it stands, there are very few tools that can consume ES6 directly, including most versions of Node.

(Yes, even if you are using Node 4.x with its many lovely ES6 features like Promises and arrow functions, there are still some missing features, like spread arguments and destructuring, that are not supported by V8 yet.)

So, what many ES6 authors will do is add a "prepublish" script to build the ES6 source into a version consumable by Node/npm (here’s an example). (Note that your "main" field in package.json must point to the Node-ready version, not the ES6 version!) Of course, this adds a huge amount of additional complexity to your build script, because now you have three versions of your code: 1) source, 2) Node version, and 3) browser version.

When you add an ES6 module bundler like Rollup, it gets even hairier. Rollup is a really cool bundler that offers some big benefits over Browserify and Webpack (such as smaller bundle sizes), but to use it, it expects your library’s dependencies to be exported in the ES6 format.

Now, because npm normally expects CommonJS, not ES6 modules, there is an informal “jsnext:main” field that some libraries use to point to their ES6 source. Usage is not very widespread, though, so if any of your dependencies don’t use ES6 or don’t have a "jsnext:main", then you’ll need to use Rollup’s --external flag when bundling them so that it knows to ignore them.

"jsnext:main" is a nice hack, but it also brings up a host of unanswered questions, such as: which features of ES6 are supported? Is it a particular stage of recommendation for the spec, ala Babel? What about popular ES7 features that are already starting to creep into codebases that use Babel, such as async/await? It’s not clear, and I don’t think this problem will be resolved until npm takes a stance one way or the other.

Making sense of this mess

At the end of the day, if your users want your code bad enough, then they will find a way to consume it. In the worst case scenario, they can just copy-paste your code from Github, which is how JavaScript was consumed for many years anyway. (StackOverflow was a decent package manager long before cooler kids like npm and Bower came along!)

Many folks have advised me to just support npm and CommonJS, and honestly, for my smaller modules I’m doing just that. It’s simply too much work to try to support everything at once. As an example of how complicated it is, I’ve created a hello-javascript module that only contains the code you need to support all the environments above. Hopefully it will help someone trying to figure out how to publish to multiple targets.

If you happen to be thinking about hopping into the world of JavaScript library authorship, though, I recommend starting with npm’s publishing guide and working your way up from there. Trying to support every JavaScript user on the planet is an ambitious proposition, and you don’t want to wear yourself out when you’re having enough trouble testing, writing documentation, checking code coverage, triaging issues, and hey – at some point, you’ll also need to write some code!

But as with everything in software, the best advice is to focus on the user and all else will follow. Don’t listen to the naysayers who tell you that Bower users are “wrong” and you’re doing them a favor by “educating” them ⁴. Work with your users to try to support their use case, and give them alternatives if they’re unsatisfied with your current publishing approach. (I really like wzrd.in for on-demand Browserification.)

To me, this is somewhat like accessibility. Some users only know Bower, not npm, or maybe they don’t even understand the difference between the two! Others might be unfamiliar with the command line, and in that case, a big reassuring “Download” button on a github.io page might be the best way to accommodate them. Still others might be power users who will try to include your ES6 code directly and then Browserify it themselves. (Ask those users for a pull request!)

At the end of the day, you are giving away your labor for free, so you shouldn’t feel obligated to bend over backwards for anybody. But if your driving motivation is to make your code as usable as possible for other people, then I’d say you can’t go wrong by supporting the two most popular options: direct downloads for casual users, and npm/CommonJS for power users. If your library grows in popularity, you can always worry about the thousand and one other methods later. ⁵

Thanks to Calvin Metcalf, Nick Colley, and Colin Skow for providing feedback on a draft of this post.

Footnotes

1. I’ve seen no compelling reason to call it “ES2015,” except to signal my own status as a smarty-pants. So I don’t.

2. Another handy tool is derequire, which can remove all require()s from your bundle to ensure it doesn’t get re-interpreted as a CommonJS module.

3. Calvin Metcalf pointed out to me that you can also work around this issue by using crypto sub-modules, e.g. require('crypto-hash'), or by fooling Browserify via require('cryp' + 'to').

4. With npm 3, many developers are starting to declare Bower to be obsolete. I think this is mostly right, but there are still a few areas where Bower beats npm. First off, for isomorphic libraries like PouchDB, an npm install can be more time-consuming and error-prone than a bower install, due to native LevelDB dependencies that you’ll never need if you’re only using PouchDB on the frontend. Second, not all libraries are publishing their dist/ code to npm, meaning that former Bower users would have to learn the whole Browserify/Webpack stack rather than just include a <script> tag. Third, not all Bower modules are even on npm – Ionic framework is a popular one that springs to mind. Fourth, there’s the social cost of migrating folks from Bower to npm, throwing away a wealth of tutorials and accumulated knowledge in the process. It’s not so simple to just tell people, “Okay, now start using npm instead of Bower.”

5. I’ve ragged a lot on the JavaScript community in this post, but I still find authoring for JavaScript to be a very pleasurable experience. I’ve been a consumer of Python, Java, and Perl modules, as well as a publisher of Java modules, and I still find npm to be the nicest to work with. The fact that my publish process is as simple as npm version patch|minor|major plus a npm publish is a real dream compared to the somewhat bureaucratic process for asking permission to publish to Maven Central. (If I ever have to see the Sonatype Nexus web UI again, I swear I’m going to hurl.)

Read the Tea Leaves Software and other dark arts, by Nolan Lawson