I think we can all agree that the last thing this world needs is another object-relational mapper. Except...we do. Just one more...
Imagine for a second if there were simple, well-defined, language-neutral constructs for describing database tables, constraints, relationships, queries, the whole lot. Imagine you could use these constructs to generate SQL, sometimes engine-specific SQL, for any and all of the operations even the most robust ORM may need. In this idealized world, all the various ORM maintainers from all the various languages could join hands, share the same awesome test suite, and sing kumbaya...
This imaginary meta-ORM, being relatively simple and only dealing in declarative constructs, could be easily ported to various languages -- but the test suite itself, that could be shared amongst everyone. The behavior of this meta-ORM is simple: it just takes one declarative input and returns a known output (engine-specific SQL). As bugs and quirky edge cases are found they can be added to some ever-growing test suite. Actual ORMs could compile down to these constructs without changing their APIs. Various NoSQL databases may also be able to benefit, and it would certainly be a boon to client-side devs hoping to use the new Web SQL Database API in a sane manner. This imaginary world would be pretty awesome, but alas, perhaps too far out of reach.
Of course, through various standards we already have a firm handle on the very constructs mentioned above for modeling, referencing and querying: XML-Schema, XLink and XQuery! Just kidding, really -- you can keep reading...
But seriously, buried in a steaming pile of angle brackets, these XML standards certainly have everything we'd need to achieve the above-stated ideal. But the outcome would be far from ideal. To paraphrase Mr. Crockford: XML is the union of all languages; JSON is the intersection. Perhaps in JSON we can find a simpler, more suitable solution?
As it happens, a format suitable for describing SQL data models as JSON exists and, thankfully, bears no resemblance to XML-Schema and its ilk. The json-schema standard can succinctly model objects fit for storage in SQL or NoSQL databases. A json-schema instance can be as flexible or as rigidly descriptive as you need it to be -- of course, if your object's bound for a SQL backend, you'll probably want it fairly rigid. Relationship definitions are handled by way of json-references and, optionally, queries could be modeled using resource-query (no link yet), which is an offshoot of (but not strictly derived from) json-query. These standards give us a much simpler declarative language than any XML dialect could and provide the extensibility necessary to accommodate even the most arcane database features.
So while every ORM reinvents the wheel for SQL generation, it would be possible for them to instead use these standards as a compile target in some or all cases, and let this meta-ORM -- really just a SQL-builder -- do the heavy lifting, handling vendor quirks and crazy edge cases. Is there something I'm missing? Why doesn't this exist yet?
If you've been following the development of persevere 2.0 you may notice that these notions -- json-schema and resource-query -- also happen to be how perstore, persevere's persistence framework, operates. This means that if such a meta-ORM were created using this approach it would drop directly into perstore, allowing it to talk to any supported SQL engine. This would make me very happy, sure, but frankly, it doesn't matter much whether it's json-schema and resource-query, some crazy XML dialects or something else entirely (though it would be nice if the formats were easy to parse and compile to) -- what matters most is that such a project exists. Perhaps it already does -- if so, I'd love to know about it. Otherwise, come all ye SQL jockeys and let's start writing some specs...
There are three distinct applications for this meta-ORM: table and index creation, schema migrations and query building.
A good place to start would be defining the easier CREATE TABLE cases with just the scalars as defined in json-schema (string, number, integer, boolean, null). We could use the format key as a hint for things like date-time, date and time, but we'd need to extend json-schema to allow for BLOBs, XML and the like. The maxLength key can tell us the column length. We would also need to extend json-schema to control index creation: a simple {"index": true} could offer a sane default but there are other switches users may want to flip. This is the beauty of something like json-schema for this -- a schema can be as simple or verbose as need be and can gracefully degrade on engines which don't support particular extensions. Extensions are inevitable -- we just need to define them in a clear (and preferably vendor-neutral) way.
Some SQL-specific data may not belong in json-schema. For instance, database and table names are irrelevant to an actual object model but are often required to generate SQL. We could define an options object for these cases.
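To make the last two paragraphs a bit more concrete, here's a rough sketch in Python of what the CREATE TABLE case might look like. Everything in it is an assumption for illustration: the function names, the type mappings, the "table" key in the options object and the naming of the generated index are all made up, and a real builder would need identifier quoting and per-engine type tables.

```python
# Hypothetical mappings from json-schema scalars and format hints to generic SQL types.
SQL_TYPES = {
    "number": "DOUBLE PRECISION",
    "integer": "INTEGER",
    "boolean": "BOOLEAN",
}
FORMAT_TYPES = {"date-time": "TIMESTAMP", "date": "DATE", "time": "TIME"}


def column_type(prop):
    """Pick a column type from a single json-schema property definition."""
    if prop.get("format") in FORMAT_TYPES:
        return FORMAT_TYPES[prop["format"]]
    if prop["type"] == "string":
        # maxLength tells us the column length; fall back to TEXT without it.
        if "maxLength" in prop:
            return f"VARCHAR({prop['maxLength']})"
        return "TEXT"
    return SQL_TYPES[prop["type"]]


def create_table(schema, options):
    """Return a list of SQL statements for a schema plus an options object.

    The table name lives in `options` (an assumed shape), not in the schema.
    Identifiers are not quoted or validated in this sketch.
    """
    table = options["table"]
    columns = []
    indexes = []
    for name, prop in schema["properties"].items():
        columns.append(f"  {name} {column_type(prop)}")
        # {"index": true} is the assumed json-schema extension for a default index.
        if prop.get("index"):
            indexes.append(f"CREATE INDEX {table}_{name}_idx ON {table} ({name});")
    create = f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"
    return [create] + indexes


person_schema = {
    "properties": {
        "name": {"type": "string", "maxLength": 50, "index": True},
        "age": {"type": "integer"},
        "created": {"type": "string", "format": "date-time"},
    }
}

for statement in create_table(person_schema, {"table": "person"}):
    print(statement)
```

Keeping the table name in a separate options object means the same schema could be reused for validation or for a NoSQL store without any SQL baggage attached.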
A more complex but really useful proposition would be the ability to compile ALTER statements given two schema instances. This would make writing apps with SQL backends suck significantly less. More power to you if you can go the NoSQL route, but SQL, in many cases, is simply a fact of life for us developers. Some ORMs provide some variation on this feature, but we can do it once, do it right, and make SQL hurt less in a generalizable way, so let's do that.
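As a minimal sketch of the idea, assuming we only care about added and removed columns, diffing two schema instances might look something like this in Python. The function name and the type mapping are hypothetical, and changed column types are deliberately ignored here since the ALTER syntax for those varies by engine.

```python
# Hypothetical, deliberately tiny type mapping for the sketch.
TYPES = {
    "string": "TEXT",
    "number": "DOUBLE PRECISION",
    "integer": "INTEGER",
    "boolean": "BOOLEAN",
}


def alter_statements(table, old_schema, new_schema):
    """Compare the properties of two schemas and emit ALTER TABLE statements.

    Only added and removed properties are handled; renames and type
    changes are out of scope for this sketch.
    """
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    statements = []
    # Properties present only in the new schema become new columns.
    for name, prop in new_props.items():
        if name not in old_props:
            statements.append(
                f"ALTER TABLE {table} ADD COLUMN {name} {TYPES[prop['type']]};"
            )
    # Properties present only in the old schema are dropped.
    for name in old_props:
        if name not in new_props:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {name};")
    return statements


old = {"properties": {"name": {"type": "string"}, "nickname": {"type": "string"}}}
new = {"properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}

for statement in alter_statements("person", old, new):
    print(statement)
```

Note that a rename is indistinguishable from a drop plus an add when all you have is two schemas, so a real spec would probably need some explicit hint for that case.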
Compiling a SELECT statement from a simple resource-query should be trivial, but semantics for more complex operations must be defined as well. The resource-query syntax was defined to be expressive as well as easy to parse (and also to be legal in the URL query component, but that's just a bonus). It exposes two key constructs that should enable a wide range of queries -- comparisons and calls. A comparison is something like "foo=bar". Greater-than and less-than operators can also be used: "foo>42". Calls are like method invocations: sort(foo) or distinct(bar). Calls can take more than one argument: sort(foo,-bar). Comparisons and calls can be chained together with & or | logic. This could be extended further if need be but should enable a whole range of query possibilities. Sure, it would not fully equal SQL's expressiveness, but nothing would stop a user, or an ORM, from dropping down into SQL where needed.
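Here's a rough Python sketch of the simple case. Since there's no resource-query spec to point at yet, the grammar below is inferred purely from the examples above: it handles =, > and < comparisons joined with &, plus the sort call, and treats a leading "-" on a sort argument as descending (my assumption). The function name is hypothetical, | logic and other calls like distinct are left out, and values come back as parameters rather than being spliced into the SQL.

```python
import re

# Assumed grammar, based only on the examples in the text.
COMPARISON = re.compile(r"^(\w+)(=|>|<)(.+)$")
CALL = re.compile(r"^(\w+)\((.*)\)$")
IDENTIFIER = re.compile(r"^\w+$")


def build_select(table, query):
    """Compile a simple resource-query string into (sql, params)."""
    where, params, order_by = [], [], []
    for term in query.split("&"):
        call = CALL.match(term)
        if call:
            name, args = call.group(1), call.group(2).split(",")
            if name != "sort":
                raise ValueError(f"unsupported call in this sketch: {name}")
            for arg in args:
                # A leading "-" is assumed to mean descending order.
                column = arg[1:] if arg.startswith("-") else arg
                if not IDENTIFIER.match(column):
                    raise ValueError(f"bad sort column: {arg}")
                direction = "DESC" if arg.startswith("-") else "ASC"
                order_by.append(f"{column} {direction}")
            continue
        comparison = COMPARISON.match(term)
        if not comparison:
            raise ValueError(f"cannot parse term: {term}")
        column, operator, value = comparison.groups()
        # Values are passed as parameters; only column names reach the SQL text.
        where.append(f"{column} {operator} ?")
        params.append(value)
    sql = f"SELECT * FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if order_by:
        sql += " ORDER BY " + ", ".join(order_by)
    return sql, params


sql, params = build_select("items", "foo=bar&price>42&sort(foo,-bar)")
print(sql)
print(params)
```

All values come through as strings here; deciding when "42" should be a number is exactly the kind of thing the spec would need to pin down.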
If you or someone you know could get down with this, ping @deanlandolt on Twitter and let's get to work.