Thursday 4 April 2013

OrientDB: new GraphDB Engine in beta

London, April 4th 2013

After about a month of development and testing, the NuvolaBase team has released the new GraphDB Engine!

The new Engine uses some novel techniques based on the idea of a dynamic Graph that changes shape at run-time based on its settings and content. The new Engine is much faster than before and needs less space in memory and on disk. The main improvements are:

  1. Edges without properties are no longer created as documents. For Graphs with no properties on edges this can save more than 50% of space on disk, and therefore in memory, which increases the chances of keeping a large part of the database in cache. It also speeds up traversal, because one less record has to be loaded. As soon as the first property is set, the edge is transparently converted into a document
  2. Vertex "in" and "out" fields are no longer defined in the schema, because they can be of different types and change at run-time, adapting to the content:
    1. no connections = null (no space taken)
    2. 1 connection = stored as a LINK (a few bytes)
    3. >1 connections = stored as a Set of LINKs (using the MVRBTreeRIDSet class)
  3. The Blueprints "label" concept is bound to OrientDB sub-classes. If you create an edge with label "friend", the edge sub-type "friend" will be used (created transparently by the engine). This means:
    1. one field less in the document (the "label" field), and therefore less space and the ability to use technique 1 (see above)
    2. edges are stored in different files at the file system level, because different clusters are used
    3. better partitioning across multiple disks (and, in the future, more parallelism)
    4. direct queries like "select from friend" rather than "select from E" followed by filtering the result set for the edges with the wanted label property
  4. Separate properties for edges of different labels. A Vertex no longer has a single "in" and a single "out", but fields such as "out_friend", which stores all the outgoing edges of class "friend". This means faster traversal of edges when one or more labels are given, because there is no need to scan the entire Set of edges to find the right ones
  5. With such a dynamic Graph, in the future we could also support HyperGraphs in a flash
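
Points 2 and 4 above can be pictured with a small model. This is only a sketch under stated assumptions, not OrientDB's code: the classes SketchVertex and DynamicGraphSketch, their methods and the record IDs are hypothetical, plain Java collections stand in for LINK and MVRBTreeRIDSet, and only outgoing edges are modelled.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not OrientDB code: a vertex keeps one field per edge
// label, and each field changes its type as connections are added.
final class SketchVertex {
    // Field name (e.g. "out_friend") -> String (one link) or Set<String> (many links).
    // A missing entry plays the role of null: no connection, no space taken.
    private final Map<String, Object> fields = new LinkedHashMap<String, Object>();

    // Records an outgoing edge with the given label pointing at targetRid.
    @SuppressWarnings("unchecked")
    void addOut(String label, String targetRid) {
        String field = "out_" + label;
        Object current = fields.get(field);
        if (current == null) {
            fields.put(field, targetRid);                 // first connection: single link
        } else if (current instanceof String) {
            if (current.equals(targetRid)) {
                return;                                   // same target already stored
            }
            Set<String> set = new LinkedHashSet<String>(); // second connection: upgrade to a set
            set.add((String) current);
            set.add(targetRid);
            fields.put(field, set);
        } else {
            ((Set<String>) current).add(targetRid);       // already a set: just add
        }
    }

    // Describes how the field for this label is currently stored.
    String storageOf(String label) {
        Object current = fields.get("out_" + label);
        if (current == null) {
            return "null";
        }
        return current instanceof String ? "LINK" : "LINKSET";
    }

    // Returns the targets for one label by reading only that label's field.
    @SuppressWarnings("unchecked")
    Set<String> out(String label) {
        Object current = fields.get("out_" + label);
        if (current == null) {
            return Collections.<String>emptySet();
        }
        if (current instanceof String) {
            return Collections.singleton((String) current);
        }
        return Collections.unmodifiableSet((Set<String>) current);
    }
}

final class DynamicGraphSketch {
    public static void main(String[] args) {
        SketchVertex luca = new SketchVertex();
        System.out.println("no edges:  " + luca.storageOf("friend"));
        luca.addOut("friend", "#10:1");
        System.out.println("one edge:  " + luca.storageOf("friend"));
        luca.addOut("friend", "#10:2");
        System.out.println("two edges: " + luca.storageOf("friend"));
        luca.addOut("colleague", "#10:3");
        // Traversing by label reads "out_friend" only and never scans "out_colleague".
        System.out.println("friends:   " + luca.out("friend"));
    }
}
```

Keeping one field per label is what lets a labelled traversal skip unrelated edges entirely, at the cost of one extra field per label actually used by the vertex.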

Such a new Engine needed either a new API or a radical change to the current Raw API, breaking compatibility with the past. Instead, we decided to change strategy by re-implementing the Blueprints Graph layer as the new GraphDB Engine. So the new GraphDB Engine IS OrientDB's Blueprints implementation.

Why? Mainly because:
  1. Blueprints, made by the TinkerPop team, is the de facto standard for Graph Databases
  2. The TinkerPop team is amazing and has built a lot of technologies on top of the Blueprints layer
  3. The latest release of Blueprints added some new features that allow implementations to use the underlying engine in a more powerful way
  4. The Blueprints API and the whole TinkerPop stack are very well documented, with a lot of examples and a new book on the way

The new GraphDB engine depends on OrientDB 1.4.0-SNAPSHOT, so we can't push it to the TinkerPop repository, because SNAPSHOT dependencies are not allowed there. As soon as we release OrientDB 1.4 we're going to merge it into the official TinkerPop Blueprints repository.

Starting from OrientDB 1.4, the GraphDB API to use is Blueprints. Period. I'm sure this will make some users happy, because the Raw API is horrible: you have to work at the document level, using the OGraphDatabase class for any operation on vertices and edges (not really Object Oriented).

While waiting for the official release, you can clone the master branch of NuvolaBase's Blueprints fork and start using the new GraphDB Engine: https://github.com/nuvolabase/blueprints. It passes all the Blueprints Test Cases.

To open databases created with previous releases, use:

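// Open a database created with a previous release.
OrientGraph graph = new OrientGraph("local:/temp/mydb");
// Turn off lightweight edges (improvement 1 above).
graph.setUseLightweightEdges(false);
// Turn off the per-label vertex fields such as "out_friend" (improvement 4 above).
graph.setUseVertexFieldsForEdgeLabels(false);
// Turn off the binding of edge labels to edge sub-classes (improvement 3 above).
graph.setUseCustomClassesForEdges(false);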

A new tool to convert databases to the new format will be released in the next few days.


Luca Garulli
CEO at NuvolaBase.com
the Company behind OrientDB
Follow me on http://twitter.com/lgarulli

3 comments:

  1. Awesome! Looking forward to trying the new engine this weekend.

  2. Will this break other 3rd-party language bindings? (Especially the ones that use the remote binary protocol.)

  3. Nothing breaks: the underlying layer (document) is the same. What changes is the way the GraphDB Blueprints layer uses the underlying storage.

