Posts Tagged ‘couchdb’

What happened to PouchDroid?

Just in case anyone is wondering about PouchDroid, a few things changed since I started working on it three months ago:

  1. I found out CouchBase Lite exists. So you may want to try that instead of some crazy JavaScript thing.
  2. I started working on PouchDB itself.

I do plan on eventually updating PouchDroid, but for the time being there’s still plenty to be done in PouchDB. Long-term goals will be:

  1. API parity with PouchDB (at least, the important parts).
  2. Get a rigorous suite of tests in place.
  3. Once the tests are passing, port it to pure Java.

If you still want to use PouchDroid in a Cordova/PhoneGap app, I’d recommend using PouchDB itself and the SQLite plugin instead. PouchDroid uses a slightly modified version of the SQLite plugin, which probably won’t give you any noticeable performance improvements. As for the XHR overrides, I’ll probably take that out, so you should just set up CORS on your CouchDB.

If you want to use PouchDroid in a Java Android project, it’s still pretty nifty for small sync tasks (e.g. the PouchDroidMigrationTask, which can sync a SQLite database to CouchDB). But if you want something more full-featured and reliable, wait for 1.0.

Porting PouchDB to Android: initial work and thoughts

Update: CouchDroid has been renamed to PouchDroid, and I’ve released version 0.1.0. The code examples below are out of date. Please refer to the instructions and tutorials on the GitHub page.

I love PouchDB. It demonstrates the strength and flexibility of CouchDB, and since it supports both WebSQL and IndexedDB under the hood, it obviates the need to learn their separate APIs (or to worry about the inevitable browser inconsistencies). If you know CouchDB, you already know PouchDB.

And most importantly, it offers two-way sync in just a few lines of code. To me, this is magical:

var db = new PouchDB('mydb')
db.replicate.to('http://foo.com:5984/db', {continuous : true});
db.replicate.from('http://foo.com:5984/db', {continuous : true});

I wanted to bring this same magic to Android, so I started working on an Android adapter for PouchDB. I’m calling it CouchDroid, until I can think of a better name. The concept is either completely crazy or kinda clever, which is why I’m writing this post, in the hopes of getting early feedback.

The basic idea is this: instead of rewriting PouchDB in Java, I fire up an invisible WebView that runs PouchDB in JavaScript. I override window.openDatabase to redirect to the native Java SQLite APIs, so that all of the SQL queries run on a background thread (instead of tying up the UI thread, like they normally would). I also redirect XMLHttpRequest into Java, giving me control over the HTTP request threads, and helping avoid any messy server-side configuration of CORS/JSONP for web security.

Result: it works on a fresh CouchDB installation, no assembly required. And it’s actually pretty damned fast.

The code is still a little rough around the edges, but it can already do bidirectional sync, which is great. Callbacks look weird in Java, but static typing, generics, and content assist make the Pouch APIs a dream to work with. (My precious Ctrl+space works!)

Here’s an example of bidirectional sync between two Android devices and a CouchDB server using CouchDroid. First, we define what kinds of documents we want to sync by extending PouchDocument. This is Android, so let’s store some robots:

public class Robot extends PouchDocument {

  private String name;
  private String type;
  private String creator;
  private double awesomenessFactor;
  private int iq;
  private List<RobotFunction> functions;

  // constructors, getters, setters, toString...
}
public class RobotFunction {

  private String name;

  // constructors, getters, setters, toString...
}

I’m using Jackson for JSON serialization/deserialization, which means that your standard POJOs “just work.” The PouchDocument abstract class simply adds the required CouchDB fields _id and _rev.

In our Activity, we extend CouchDroidActivity (needed to set up the Java <-> JavaScript bridge), and we add a bunch of robots to a PouchDB<Robot>:

public class MainActivity extends CouchDroidActivity {

  private PouchDB<Robot> pouch;

  // onCreate()...

  @Override
  protected void onCouchDroidReady(CouchDroidRuntime runtime) {

    pouch = PouchDB.newPouchDB(Robot.class, runtime, "robots.db");

    List<Robot> robots = Arrays.asList(
      new Robot("C3P0", "Protocol droid", "George Lucas", 0.4, 200, 
        Arrays.asList(
          new RobotFunction("Human-cyborg relations"),
          new RobotFunction("Losing his limbs"))),
      new Robot("R2-D2", "Astromech droid", "George Lucas", 0.8, 135,
        Arrays.asList(
          new RobotFunction("Getting lost"),
          new RobotFunction("Having a secret jetpack"),
          new RobotFunction("Showing holographic messages")))    
    );

    pouch.bulkDocs(robots, new BulkCallback() {

        @Override
        public void onCallback(PouchError err, List<PouchInfo> info) {
          Log.i("Pouch", "loaded: " + info);
        }
    });
  }
}

Meanwhile, on another Android device, we load a completely different list of robots:

List<Robot> robots = Arrays.asList(
  new Robot("Mecha Godzilla", "Giant monster", "Toho", 0.4, 82, 
    Arrays.asList(
      new RobotFunction("Flying through space"),
      new RobotFunction("Kicking Godzilla's ass"))),
  new Robot("Andy", "Messenger robot", "Stephen King", 0.8, 135,
    Arrays.asList(
      new RobotFunction("Relaying messages"),
      new RobotFunction("Betraying the ka-tet"),
      new RobotFunction("Many other functions"))),
  new Robot("Bender", "Bending Unit", "Matt Groening", 0.999, 120,
    Arrays.asList(
      new RobotFunction("Gettin' drunk"),
      new RobotFunction("Burping fire"),
      new RobotFunction("Bending things"),
      new RobotFunction("Inviting you to bite his lustrous posterior")))
  );

And, of course, we set up bidirectional replication on both pouches:

String remoteCouch = "http://user:password@myhost:5984/robots";
Map<String, Object> options = Maps.quickMap("continuous", true);

pouch.replicateFrom(remoteCouch, options);
pouch.replicateTo(remoteCouch, options);

Wait a few seconds (or pass in a callback), and voilĂ ! You can check the contents on CouchDB:
C3P0 on CouchDB

And then check the contents of each PouchDB on Android:

pouch.allDocs(true, new AllDocsCallback<Robot>() {

@Override
  public void onCallback(PouchError err, AllDocsInfo<Robot> info) {
    List<Robot> robots = info.getDocuments();
    Log.i("Pouch", "pouch contains " + robots);
  }
});

This prints:

pouch contains [Robot [name=Bender, type=Bending Unit, creator=Matt Groening, 
awesomenessFactor=0.999, iq=120, functions=[RobotFunction [name=Gettin' drunk], RobotFunction 
[name=Burping fire], RobotFunction [name=Bending things], RobotFunction [name=Inviting you to bite 
his lustrous posterior]]], Robot [name=C3P0, type=Protocol droid, creator=George Lucas, 
awesomenessFactor=0.4, iq=200, functions=[RobotFunction [name=Human-cyborg relations], RobotFunction 
[name=Losing his limbs]]], Robot [name=Mecha Godzilla, type=Giant monster, creator=Toho, 
awesomenessFactor=0.4, iq=82, functions=[RobotFunction [name=Flying through space], RobotFunction 
[name=Kicking Godzilla's ass]]], Robot [name=R2-D2, type=Astromech droid, creator=George Lucas, 
awesomenessFactor=0.8, iq=135, functions=[RobotFunction [name=Getting lost], RobotFunction 
[name=Having a secret jetpack], RobotFunction [name=Showing holographic messages]]], Robot 
[name=Andy, type=Messenger robot, creator=Stephen King, awesomenessFactor=0.8, iq=135, functions=
[RobotFunction [name=Relaying messages], RobotFunction [name=Betraying the ka-tet], RobotFunction 
[name=Many other functions]]]]

So within seconds, all five documents have been synced to two separate PouchDBs and one CouchDB. Not bad!

In addition to adapting the PouchDB API for Java, I also wrote a simple migration tool to mirror an existing SQLite database to a remote CouchDB. It could be useful, if you just want a read-only web site where users can view their Android data:

new CouchDroidMigrationTask.Builder(runtime, sqliteDatabase)
    .setUserId("fooUser")
    .setCouchdbUrl("http://user:password@foo.com:5984/db")
    .addSqliteTable("SomeTable", "uniqueId")
    .addSqliteTable("SomeOtherTable", "uniqueId")
    .setProgressListener(MainActivity.this)
    .build()
    .start();

This converts SQLite data like this:

sqlite&gt; .schema Monsters
CREATE TABLE Monsters (_id integer primary key autoincrement, 
  uniqueId text not null,
  nationalDexNumber integer not null,
  type1 text not null,
  type2 text,
  name text not null);
sqlite&gt; select * from Monsters limit 1
1|001|1|Grass|Poison|Bulbasaur

into CouchDB data like this:

{
   "_id": "fooUser~pokemon_11d1eaac.db~Monsters~001",
   "_rev": "1-bd52d48dba37ce490c38d455726296f0",
   "table": "Monsters",
   "user": "fooUser",
   "sqliteDB": "pokemon_11d1eaac.db",
   "appPackage": "com.nolanlawson.couchdroid.example1",
   "content": {
       "_id": 1,
       "uniqueId": "001",
       "nationalDexNumber": 1,
       "type1": "Grass",
       "type2": "Poison",
       "name": "Bulbasaur"
   }
}

Notice that the user is included as a field and as part of the _id, so you can easily set up per-user write privileges. For per-user read privileges, you still need to set up one database per user.

CouchDroid isn’t ready for a production release yet. But even in its rudimentary state, I think it’s pretty damn exciting. As Android developers, wouldn’t it be great if we didn’t have to write so many SQL queries, and we could just put and get our POJOs? And wouldn’t it be awesome if that data were periodically synced to the server, so we didn’t even have to think about intermittent availability or incremental sync or conflict resolution or any of that junk? And wouldn’t our lives be so much easier if the data was immediately available in a RESTful web service like CouchDB, so we didn’t even need to write any server code? The dream is big, but it’s worth pursuing.

For more details on the project, check it out on GitHub. The sample apps in the examples directory are a good place to start. Example #1 is the migration script above, Example #2 is some basic CRUD operations on the Pouch API, and Example #3 is the full bidirectional sync described above. More to come!

CouchDB doesn’t want to be your database. It wants to be your web site.

I’d like to talk to you today about Couch apps. No, not CouchApps. No, not necessarily CouchApps either. The phrase has been bandied around a lot, so it’s worth explaining what I mean: I’m talking about webapps that exclusively use CouchDB for their backend, whether or not they’re actually hosted within CouchDB and regardless of how they’re built.

Yes, this is a thing people are actually trying to do, and no, it’s not crazy. The purpose of this article is to explain why.

First off, some background: CouchDB is a NoSQL database (or key-value store, as the cool kids say) written in Erlang. It is probably the origin of this joke. Nobody who uses CouchDB cares that it is written in Erlang, though, because the big selling point is that you can interact with it using Javascript, JSON, and plain ol’ HTTP. It is “a database for the web,” the first of its kind.

CouchDB: it’s a database, right?

When I first started using CouchDB, I tried to treat it like any other database. I looked for connectors based on the language I was using: Ektorp for Java, AnyEvent::CouchDB for Perl, Nano for Node. And I used the web interface (bewilderingly called “Futon”) as I would a query browser – neat for debugging, but not much else. The fact that it ran in a web browser just kinda seemed like a gimmick.

Recently, though, when I was working on a Node app that didn’t go anywhere but was a fun diversion, I came across this quote by Couch apostle J. Chris Anderson:

Because CouchDB is a web server, you can serve applications directly [to] the browser without any middle tier. When I’m feeling punchy, I like to call the traditional application server stack “extra code to make CouchDB uglier and slower.”

Suddenly, I realized what CouchDB was all about.

No wait, CouchDB is a miracle

See, here I was, using client-side Javascript to talk to Express to talk to Node to talk to Nano to talk to Couch, and at each step I was converting parameter names from underscores to camel case (or whatever my petty hangups are), all the while introducing bugs as I tried to make each layer fit nicely with the next one. And I had a working web server right in front of me! CouchDB! Why not just call it directly, you fool?! (I shout at myself in hindsight.)

I think the reason a lot of developers, like myself, might have missed this epiphany is that we’re used to treating databases as, well, databases. Whether it’s MongoDB or MySQL or Oracle, you gotta have your JDBC connector for Java and perhaps an ORM layer or maybe you just give up on Hibernate and write all the database objects yourself, so half of your code is getters and setters, but that’s OK, because that’s how we abstract the database.

You see, you can’t just have your peanut butter and jelly sandwich! You need an interface between the bread and the peanut butter, and an abstraction layer between the peanut butter and the jelly, and don’t even get me started on the jelly and the bread! What, you want your bread to get soggy?

As a programmer, I’m so used to treating databases as this other, alien thing that needs to be handled with latex gloves, separately from my application code, that reaching for the nearest library has become a reflex.

But you don’t need that with CouchDB. Because… it’s just HTTP. Any extra layers just give you another API to learn.

CouchDB is the web done right

And in fact, CouchDB is better than HTTP, because CouchDB actually fulfills the promise of what RESTful services were supposed to be, instead of the kludges we’ve come to expect. Look! DELETE actually deletes things! POST isn’t just what you use when you need to send more data than a GET allows! And HEAD and PUT are actually useful, instead of just being trivia to impress your friends at dinner parties — “Oh, did you know that there are actually more HTTP commands than just GET and POST?” “Oh, how fascinating!”

You see, once you set aside your preconceived notion of what a database is supposed to be, you can actually get rid of all your fancy connectors and just use a standard HTTP library. (I like Requests for Python.) You can even use the network debugger in a browser window to see how CouchDB does everything. It’s all just AJAX!

And then, if you make it this far down the rabbit hole, you might notice that CouchDB actually has a user authentication database, with password hashing. You might also notice that it’s even got roles and privileges and administrator controls. And that’s when you realize, with fascinated horror, the most insidious thing about CouchDB:

CouchDB doesn’t want to be your database; it wants to be your web site.

And finally, this is where we come back to the subject of Couch apps. A Couch app is just a pure HTML/CSS/Javascript application, with only CouchDB as its backend, and this is the intended use case for CouchDB.

Now, think about what this proposition means to you as a developer. The web is moving more and more towards rich, client-side applications — we’ve had jQuery for years, and now we even have MVC with platforms like Ember, Knockout, and AngularJS. If CouchDB does user authentication (it’s got a “signup” button right on the home page, for crying out loud), paging, indexing, full-text search, geo data, and it all speaks HTTP, well… what does that actually leave us to do on the server?

Take a long look in the mirror, and really ask yourself! And yes, for those of you who do machine learning and scientific computing and business intelligence, I can already see you raising your hands, but for the rest of us who get paid to write Twitter clones, the answer is: not much. Your average CRUD app can magically transform into a PGPD app (PUT, GET, POST, DELETE), you can throw it up on CouchDB with some nice HTML and CSS to style it, and be at your local brewpub by 3. Or maybe you could just send the default Futon interface to the client and tell them you wrote it.

Futon interface in CouchDB

“See, it’s a collaborative document editor, and the dude on the Couch is a lazy writer…”

Now, this is the dream. And CouchDB, as it stands in 2013, actually gets us pretty damn far toward that dream. The app I’m releasing this week, Ultimate Crossword, is a testament to that. It’s a pure Couch app that only cheats by using Solr for full-text search (because I was too lazy to learn the Lucene plugin). It’s got user accounts, data aggregation, and even continuous syncing between the client and server thanks to the wonderful PouchDB.

Building this site gave me a lot of insight into what’s possible with a Couch app. However, I also got a reality check about where CouchDB still falls short of achieving the dream. I’ve got four big complaints:

1) No per-document read privileges

This is a big one. CouchDB has three basic security modes:

  • Everyone can do everything.
  • Some people can write (some documents), everyone can read (all documents).
  • Some people can write (some documents), some people can read (all documents).

If you want to give users exclusive read access to certain documents, you have to create a separate database for each user. And unfortunately, CouchDB has no feature to do this automatically. So you need a process on the server with administrative privileges to do it, breaking the pure “Couch app” ideal. Then, if you want to aggregate the data, you actually need another process to sync to a separate database, and… well, it just gets messy. I’m strongly rooting for this feature to show up in a future CouchDB release.

2) No password recovery.

This is a feature that users have come to expect from modern web sites. And despite all its security flaws (in that it makes your email a single point of failure), it seems here to stay.

Now, CouchDB can store arbitrary data in the users table (like email addresses), and you can even do custom validation. But for the whole “give us your email, and we’ll send you a new password” thing, you’re on your own.

On the bright side, the passwords are all salted and PBKDF2-hashed, so no attacker has much to gain from cracking your Couch.

3) No database migration.

This is a big one for me, although I wonder if I’m the only one. Since my early days of Java development, I’ve appreciated having Liquibase so I could track my database schema changes in version control.

In theory, CouchDB should be ideal for something like this, since it versions everything, and even its views (aka indexes) are their own documents. But I haven’t found a good recipe for managing this yet. For the time being, I just keep a series of Python scripts that create the databases.

4) Views are not indexes, and documents are not tables.

One of the nice things about SQL databases as a development paradigm is the flexibility of the SQL language itself. Decided you wanna sort by dogsLastName instead of favoritePokemon? No problem, we’ll just add an index. Too much data getting sent across the wire? No big deal, we’ll just SELECT the fields we need, instead of SELECT(*).

In CouchDB, you can’t do a WHERE and you can’t just SELECT the fields you want. Any query that’s not simply fetching a whole document by its ID requires a view, and those are costly to create. I’ve worked with Couch databases containing millions of documents, and rebuilding a view would often take days. I’d have a coworker ask me to add a new filter criterion for a view, and on Friday I’d say, “Okay, it’ll be ready by Monday.” For the Ultimate Crossword app, I stupidly decided to use CouchDB to crunch the data itself, and I ended up needing five separate Couch servers running on solid state drives in order to process it in in a reasonable amount of time. (CouchDB is best thought of as a single-process application. It’s append-only, so it uses one process per database file.)

Also, the fact that you can’t SELECT arbitrary fields means you need to start thinking about how much data you want to send over the wire with each document, and how to threshold it. I found myself structuring my database into a summary/detail format early on, and modeling the documents very tightly to the user interface, in ways that just made me feel icky.

Database purists, of course, would say that this is where the latex gloves are supposed to come out. But I think that if CouchDB simply had a better system for managing migrations (see #3) and/or faster view creation, this would be a non-issue. I’d also love it if the output of a view could be put into its own database, so I could have endlessly kaleidoscoping views of my data. One more for the wishlist!

Conclusion

Despite these drawbacks, I still think CouchDB has a lot of potential to revolutionize the way people write webapps. I certainly still plan to use it for quick hacking (hell, the crossword app only took me ten days to write), and Couch’s append-only design means I’ll never have to worry about my data getting corrupted. (It’s been proudly touted as “the Honda Accord of databases.”)

But for all its developers’ humility, CouchDB is a really exciting technology. When you step back and look at it, it’s a daring, crazy proposition, a bold statement about how awesome web development would be if we could just let it be the web. It’s a raving streetside lunatic, grabbing random people by the shoulders and screaming at them with frantic urgency: “We don’t need the server anymore! We only need the database! The database is the server!”

In short, CouchDB is an expression of an ideal, a fantastical tale of science fiction told by wide-eyed dreamers. And if there’s one truth about wide-eyed dreamers, it’s this: with hindsight, their predictions either seem delusional, or inevitable.

(Psssst! Go check out my Ultimate Crossword app! It’ll make you feel bad about your user authentication!)

Update: I decided to remove the CouchDB user authentication from the Ultimate Crossword app (I realized it was irresponsible to let people collaboratively “solve” the puzzle), but it’s still a pure Couch app!