Tuesday, 2 October 2012

OrientDB multitenancy with partitioned graphs


Introduction

This tutorial explains step by step how to create partitioned graphs using the Record Level Security feature introduced in OrientDB 1.2.0. With this feature, a database's records can be fully separated into sandboxes, where "Restricted" records can't be accessed by unauthorized users. The tutorial also demonstrates that these sandboxes work with the GraphDB API and the TinkerPop stack. Partitioning graphs makes it possible to build real multi-tenant applications with little effort.

Requirements:

  • OrientDB 1.2.0-SNAPSHOT or later
  • TinkerPop Blueprints 2.2.0 or later

Create a new empty graph database

First open the console of the GraphDB Edition and create the new database "blog" of type "graph" against the local file-system:

$ cd $ORIENTDB_HOME/bin
$ console.sh
OrientDB console v.1.2.0-SNAPSHOT www.orientechnologies.com
Type 'help' to display all the commands supported.
Installing extensions for GREMLIN language v.2.2.0-SNAPSHOT

orientdb> create database local:../databases/blog admin admin local graph
Creating database [local:../databases/blog] using the storage type [local]...
Database created successfully.
Current database is: local:../databases/blog

Enable graph partitioning

Now turn on partitioning for the graph by making the classes V (Vertex) and E (Edge) extend the ORestricted class. This way, any access to Vertex and Edge instances can be restricted:

orientdb> alter class V superclass ORestricted
Class updated successfully

orientdb> alter class E superclass ORestricted
Class updated successfully
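
To make the idea concrete, here is a minimal toy model in Python of what a restricted class does conceptually. It is not OrientDB code: the RestrictedStore class, the tenant names and the record ids are hypothetical, and the only assumption taken from this tutorial is that each restricted record carries a list of allowed users (the `_allow` field shown later in the console output) and that reads skip records the current user is not listed in.

```python
# Toy model only: this is NOT OrientDB code. It mimics the idea that a
# restricted record carries an allow set and is visible only to its members.
from dataclasses import dataclass, field


@dataclass
class Record:
    rid: str
    fields: dict
    allow: set = field(default_factory=set)  # plays the role of `_allow`


class RestrictedStore:
    def __init__(self):
        self._records = {}

    def create(self, user, rid, **fields):
        # The creator is automatically placed in the record's allow set.
        self._records[rid] = Record(rid, fields, {user})
        return self._records[rid]

    def select_all(self, user):
        # Every read is filtered: records the user is not allowed to see
        # are silently skipped, as if they did not exist.
        return [r for r in self._records.values() if user in r.allow]


store = RestrictedStore()
store.create("tenant_a", "r1", name="Pizza")
store.create("tenant_b", "r2", name="Ferrari Modena")

print([r.rid for r in store.select_all("tenant_a")])  # ['r1']
print([r.rid for r in store.select_all("tenant_b")])  # ['r2']
```

Because the filter sits in the store rather than in the caller, application code does not need to add a tenant condition to every query, which is the property the rest of this tutorial relies on.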

Create 2 users

Now let's create 2 users: "luca" and "steve". First list the current roles in the database to find the "writer" role's RID:

orientdb> select from orole
---+---------+--------------------+--------------------+--------------------+--------------------
  #| RID     |name                |mode                |rules               |inheritedRole
---+---------+--------------------+--------------------+--------------------+--------------------
  0|     #4:0|admin               |1                   |{}                  |null
  1|     #4:1|reader              |0                   |{database=2, database.schema=2, database.cluster.internal=2, database.cluster.orole=2, database.cluster.ouser=2, database.class.*=2, database.cluster.*=2, database.command=2, database.hook.record=2}|null
  2|     #4:2|writer              |0                   |{database=2, database.schema=7, database.cluster.internal=2, database.cluster.orole=2, database.cluster.ouser=2, database.class.*=15, database.cluster.*=15, database.command=15, database.hook.record=15}|null
---+---------+--------------------+--------------------+--------------------+--------------------
3 item(s) found. Query executed in 0.045 sec(s).

Found it: it's #4:2. Now create the 2 users with #4:2 (writer) as their first role:

orientdb> insert into ouser set name = 'luca', status = 'ACTIVE', password = 'luca', roles = [#4:2]
Inserted record 'OUser#5:4{name:luca,password:{SHA-256}D70F47790F689414789EEFF231703429C7F88A10210775906460EDBF38589D90,roles:[1]} v1' in 0,001000 sec(s).

orientdb> insert into ouser set name = 'steve', status = 'ACTIVE', password = 'steve', roles = [#4:2]
Inserted record 'OUser#5:3{name:steve,password:{SHA-256}F148389D080CFE85952998A8A367E2F7EAF35F2D72D2599A5B0412FE4094D65C,roles:[1]} v1' in 0,001000 sec(s).
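
As a side note, the numbers in the "rules" column of the role listing above (2, 7, 15) look like permission bitmasks. The following Python sketch decodes them under the assumption that the bits are CREATE=1, READ=2, UPDATE=4 and DELETE=8, which is how OrientDB's ORole permission constants are commonly described; the decode_permissions helper is hypothetical and only serves to read the table.

```python
# Assumed bit values (believed to mirror OrientDB's ORole permission
# constants): CREATE=1, READ=2, UPDATE=4, DELETE=8.
PERMISSION_BITS = {"CREATE": 1, "READ": 2, "UPDATE": 4, "DELETE": 8}


def decode_permissions(mask):
    """Return the names of the permission bits set in `mask`."""
    return [name for name, bit in PERMISSION_BITS.items() if mask & bit]


print(decode_permissions(2))   # ['READ']
print(decode_permissions(7))   # ['CREATE', 'READ', 'UPDATE']
print(decode_permissions(15))  # ['CREATE', 'READ', 'UPDATE', 'DELETE']
```

Read this way, the "reader" role can only read classes and clusters (2), while the "writer" role has full access to them (15) but cannot delete schema elements (7).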


Create a simple graph as user 'Luca'

Now it's time to disconnect and reconnect to the blog database using the new "luca" user:

orientdb> disconnect
Disconnecting from the database [blog]...OK

orientdb> connect local:../databases/blog luca luca
Connecting to database [local:../databases/blog] with user 'luca'...OK

Now create 2 vertices: a Restaurant and a Pizza:

orientdb> create vertex set label = 'food', name = 'Pizza'
Created vertex 'V#9:0{label:food,name:Pizza,_allow:[1]} v0' in 0,001000 sec(s).

orientdb> create vertex set label = 'restaurant', name = "Dante's Pizza"
Created vertex 'V#9:1{label:restaurant,name:Dante's Pizza,_allow:[1]} v0' in 0,000000 sec(s).

Now connect these 2 vertices with an edge labelled "menu":

orientdb> create edge from #9:0 to #9:1 set label = 'menu'
Created edge '[E#10:0{out:#9:0,in:#9:1,label:menu,_allow:[1]} v1]' in 0,003000 sec(s).

To check if everything is ok execute a select against vertices:

orientdb> select from v
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
  #| RID     |label               |name                |_allow              |out                 |in
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
  0|     #9:0|food                |Pizza               |[1]                 |[1]
  1|     #9:1|restaurant          |Dante's Pizza       |[1]                 |null                |[1]
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
2 item(s) found. Query executed in 0.034 sec(s).

Create a simple graph as user 'Steve'

Now let's connect to the database as the 'steve' user and check whether any vertices are visible:

orientdb> disconnect
Disconnecting from the database [blog]...OK

orientdb> connect local:../databases/blog steve steve
Connecting to database [local:../databases/blog] with user 'steve'...OK

orientdb> select from v
0 item(s) found. Query executed in 0.0 sec(s).

Ok, no vertices found. Try to create something:

orientdb> create vertex set label = 'car', name = 'Ferrari Modena'
Created vertex 'V#9:2{label:car,name:Ferrari Modena,_allow:[1]} v0' in 0,000000 sec(s).

orientdb> create vertex set label = 'driver', name = 'steve'
Created vertex 'V#9:3{label:driver,name:steve,_allow:[1]} v0' in 0,000000 sec(s).

orientdb> create edge from #9:2 to #9:3 set label = 'drive'
Created edge '[E#10:1{out:#9:2,in:#9:3,label:drive,_allow:[1]} v1]' in 0,002000 sec(s).

Now check the graph just created:

orientdb> select from v
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
  #| RID     |label               |name                |_allow              |out                 |in
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
  0|     #9:2|car                 |Ferrari Modena      |[1]                 |[1]
  1|     #9:3|driver              |steve               |[1]                 |null                |[1]
---+---------+--------------------+--------------------+--------------------+--------------------+--------------------
2 item(s) found. Query executed in 0.034 sec(s).

The "steve" user doesn't see the vertices and edges created by other users!
What happens if we try to connect 2 vertices that belong to different users?

orientdb> create edge from #9:2 to #9:0 set label = 'security-test'

Error: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: OCommandSQL [text=create edge from #9:2 to #9:0 set label = 'security-test']
Error: java.lang.IllegalArgumentException: Source vertex '#9:0' does not exist

The partition is totally isolated: OrientDB reports that the vertex doesn't exist, although it is present but invisible to the current user.
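
The behaviour above can be sketched with a small toy model in Python. This is not OrientDB code: the load_vertex and create_edge helpers and the error message are hypothetical, and the only assumption is the one the tutorial demonstrates, namely that vertex lookups pass through the same visibility filter as queries.

```python
# Toy model only, not OrientDB code: vertex lookups pass through the same
# visibility filter as queries, so a hidden vertex looks like a missing one.
vertices = {
    "#9:0": {"name": "Pizza", "_allow": {"luca"}},
    "#9:2": {"name": "Ferrari Modena", "_allow": {"steve"}},
}


def load_vertex(user, rid):
    """Return the vertex if it exists AND the user may see it, else None."""
    vertex = vertices.get(rid)
    if vertex is None or user not in vertex["_allow"]:
        return None
    return vertex


def create_edge(user, source, target, label):
    for rid in (source, target):
        if load_vertex(user, rid) is None:
            # Same error whether the vertex is absent or merely hidden.
            raise ValueError(f"Vertex '{rid}' does not exist")
    return {"out": source, "in": target, "label": label, "_allow": {user}}


try:
    create_edge("steve", "#9:2", "#9:0", "security-test")
except ValueError as err:
    print(err)  # Vertex '#9:0' does not exist
```

Returning the same error for a hidden vertex and a missing one means a tenant cannot even probe for the existence of another tenant's records.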

TinkerPop Stack

The Record Level Security feature is very powerful because it acts at a low level inside the OrientDB engine. This is why everything works seamlessly, even with the TinkerPop stack.

Now try to display all the vertices and edges using Gremlin:

orientdb> gremlin g.V
[v[#9:2], v[#9:3]]
Script executed in 0,448000 sec(s).

orientdb> gremlin g.E
e[#10:1][#9:2-drive->#9:3]
Script executed in 0,123000 sec(s).

The same applies to other technologies built on TinkerPop Blueprints: TinkerPop Rexster, TinkerPop Pipes, TinkerPop Furnace, TinkerPop Frames and ThinkAurelius Faunus.

This tutorial has been published at http://code.google.com/p/orient/wiki/PartitionedGraphs.

3 comments:

  1. Hi,
    I was looking around for a multi-tenant solution for a SaaS application that didn't make me jump through as many hoops to implement and secure, as most RDBMSs do.

    This partitioned graph approach looks like a really elegant solution, since it completely isolates data between different tenants/customers at the DB level (hence making it immune to bugs or security holes at the application level). I think this makes it a lot easier to pitch to prospective clients who're paranoid about the security and privacy of their data. This approach is as secure as a single-db-per-client system, while keeping administrative overheads to a minimum.

    I can now make my applications 'almost' tenant agnostic (whew!), since the underlying data store will take care of that. I'll just need to write some logic to use a different username for the db connection per tenant.

    Is it possible to still have a set of common data (system-wide settings, like a common payment gateway configuration for example) that can be shared with all the DB users?

    Regards,
    Aditya

    1. Hi Aditya, with the solution proposed in this post all users share the same database, which takes care of allowing or denying access to the restricted records. You could therefore keep some records unrestricted, so that every user can access them. So the answer is yes!

  2. Hi Luca, I'm not sure I understand how to do that. For example, I would like to create two types of vertex: Node and NodeType (each extends V, and V extends ORestricted). Node vertices must be visible only to their owner, and NodeType vertices must be visible to all users. When I follow this post and create NodeType vertices as the admin user, these vertices are not visible from other accounts (which seems logical to me). Node and NodeType derive from V, so they are restricted.

    If I create a new type 'NodeType2' that doesn't extend ORestricted (or V in my example), then it's ok: instances of NodeType2 are visible to all users. But Node and NodeType are parts of a big graph, so both must derive from V. How can I do that?

    Thanks

    Laurent
