Import from a Relational Database

Relational databases typically query and manipulate data with SQL. Given that OrientDB supports a subset of SQL, it is relatively straightfoward to import data from a Relational databases to OrientDB. You can manage imports using the Java API, OrientDB Studio or the OrientDB Console. The examples below use the Console.

This guide covers importing into the Document Model. Beginning with version 2.0, you can import into the Graph Model using the ETL Module. From version 1.7.x you can still use ETL by installing it as a separate module

For these examples, assume that your Relational database, (referred to as reldb in the code), contains two tables: Post and Comment. The relationship between these tables is one-to-many.

reldb> SELECT * FROM post;

+----+----------------+
| id | title          |
+----+----------------+
| 10 | NoSQL movement |
| 20 | New OrientDB   |
+----+----------------+


reldb> SELECT * FROM comment;

+----+--------+--------------+
| id | postId | text         |
+----+--------+--------------+
|  0 |   10   | First        |
|  1 |   10   | Second       |
| 21 |   10   | Another      |
| 41 |   20   | First again  |
| 82 |   20   | Second Again |
+----+--------+--------------+

Given that the Relational Model doesn't use concepts from Object Oriented Programming, there are some things to consider in the transition from a Relational database to OrientDB.

  • In Relational databases there is no concept of class, so in the import to OrientDB you need to create on class per table.

  • In Relational databases, one-to-many references invert from the target table to the source table.

    Table POST    <- (foreign key) Table COMMENT
    

    In OrientDB, it follows the Object Oriented Model, so you have a collection of links connecting instances of Post and Comment.

    Class POST ->* (collection of links) Class COMMENT
    

Exporting Relational Databases

Most Relational database management systems provide a way to export the database into SQL format. What you specifically need from this is a text file that contains the SQL INSERT commands to recreate the database from scratch. For example,

When you run this utility on the example database, it produces an .sql file that contains the exported SQL of the Relational database.

DROP TABLE IF EXISTS post;
CREATE TABLE post (
id INT(11) NOT NULL AUTO_INCREMENT,
title VARCHAR(128),
PRIMARY KEY (id)
);

DROP TABLE IF EXISTS comment;
CREATE TABLE comment (
id INT(11) NOT NULL AUTO_INCREMENT,
postId INT(11),
text TEXT,
PRIMARY KEY (id),
CONSTRAINT `fk_comments`
    FOREIGN KEY (`postId` )
    REFERENCES `post` (`id` )
);

INSERT INTO POST (id, title) VALUES( 10, 'NoSQL movement' );
INSERT INTO POST (id, title) VALUES( 20, 'New OrientDB' );

INSERT INTO COMMENT (id, postId, text) VALUES( 0, 10, 'First' );
INSERT INTO COMMENT (id, postId, text) VALUES( 1, 10, 'Second' );
INSERT INTO COMMENT (id, postId, text) VALUES( 21, 10, 'Another' );
INSERT INTO COMMENT (id, postId, text) VALUES( 41, 20, 'First again' );
INSERT INTO COMMENT (id, postId, text) VALUES( 82, 20, 'Second Again' );

Modifying the Export File

Importing from the Relational database requires that you modify the SQL file to make it usable by OrientDB. In order to do this, you need to open the SQL file, (called export.sql below), in a text editor and modify the commands there. Once this is done, you can execute the file on the Console using batch mode.

Database

In order to import a data into OrientDB, you need to have a database ready to receive the import. Note that the example export.sql file doesn't include statements to create the database. You can either create a new database or use an existing one.

Using New Databases

In creating a database for the import, you can either create a volatile in-memory database, (one that is only available while OrientDB is running), or you can create a persistent disk-based database. For a persistent database, you can create it on a remote server or locally through the PLocal mode.

The recommended method is PLocal, given that it offers better performance on massive inserts.

  • Using the embedded Plocal mode:

    $ vim export.sql
    
    CREATE DATABASE PLOCAL:/tmp/db/blog admin_user admin_passwd PLOCAL DOCUMENT
    

    Here, the CREATE DATABASE command creates a new database at /tmp/db/blog.

  • Using the Remote mode:

    $ vim export.sql
    
    CREATE DATABASE REMOTE:localhost/blog root_user dkdf383dhdsj PLOCAL DOCUMENT
    

    This creates a database at the URL http://localhost/blog.

NOTE: When you create remote databases, you need the server credentials to access it. The user root and its password are stored in the $ORIENTDB_HOME/config/orientdb-server-config.xml configuration file.

Using Existing Databases

In the event that you already have a database set up and ready for the import, instead of creating a database add a line that connects to that databases, using the CONNECT command.

  • Using the embedded PLocal mode:

    $ vim export.sh
    
    CONNECT PLOCAL:/tmp/db/blog admin_user admin_passwd
    

    This connects to the database at /tmp/db/blog.

  • Using the Remote mode:

    $ vim export.sql
    
    CONNECT REMOTE:localhost/blog admin_user admin_passwd
    

    This connects to the database at the URL http://localhost/blog.

Declaring Intent

In the SQL file, after you create or connect to the database, you need to declare your intention to perform a massive insert. Intents allow you to utilize automatic tuning OrientDB for maximum performance on particular operations, such as large inserts or reads.

$ vim export.sh
...
DECLARE INTENT MASSIVEINSERT

Creating Classes

Relational databases have no parallel to concepts in Object Oriented programming, such as classes. Conversely, OrientDB doesn't have a concept of tables in the Relational sense.

Modify the SQL file, changing CREATE TABLE statements to CREATE CLASS commands:

$ vim export.sql
...
CREATE CLASS Post
CREATE CLASS Comment

NOTE: In cases where your Relational database was created using Object Relational Mapping, or ORM, tools, such as Hibernate or Data Nucleus, you have to rebuild the original Object Oriented Structure directly in OrientDB.

In the Relational database, the relationship between the post and comment was handled through foreign keys on the id fields. OrientDB handles relationships differently, using links between two or more records of the Document type.

By default, the CREATE LINK command creates a direct relationship in your object model. Navigation goes from Post to Comment and not vice versa, which is the case for the Relational database. You'll need to use the INVERSE keyword to make the links work in both directions.

Add the following line after the INSERT statements.

$ vim export.sql
...
CREATE LINK comments TYPE LINKSET FROM comment.postId TO post.id INVERSE

Remove Constraints

Unlike how Relational databases handle tables, OrientDB does not require you to create a strict schema on your classes. The properties on each class are defined through the INSERT statements. That is, id and title on Post and id, postId and text on Comment.

Given that you created a link in the above section, the property postId is no longer necessary. Instead of modifying each INSERT statement, you can use the UPDATE command to remove them at the end:

$ vim export.sql
...
UPDATE comment REMOVE postId

Bear in mind, this is an optional step. The database will still function if you leave this field in place.

Expected Output

When you've finished, remove any statements that OrientDB does not support. With the changes above this leaves you with a file similar to the one below:

$ cat export.sql

CONNECT plocal:/tmp/db/blog admin admin

DECLARE INTENT MASSIVEINSERT

CREATE CLASS Post
CREATE CLASS Comment

INSERT INTO Post (id, title) VALUES( 10, 'NoSQL movement' )
INSERT INTO Post (id, title) VALUES( 20, 'New OrientDB' )

INSERT INTO Comment (id, postId, text) VALUES( 0, 10, 'First' )
INSERT INTO Comment (id, postId, text) VALUES( 1, 10, 'Second' )
INSERT INTO Comment (id, postId, text) VALUES( 21, 10, 'Another' )
INSERT INTO Comment (id, postId, text) VALUES( 41, 20, 'First again' )
INSERT INTO Comment (id, postId, text) VALUES( 82, 20, 'Second Again' )

CREATE LINK comments TYPE LINKSET FROM Comment.postId TO Post.id INVERSE
UPDATE Comment REMOVE postId

Importing Databases

When you finish modifying the SQL file, you can execute it through the Console in batch mode. This is done by starting the Console with the SQL file given as the first argument.

$ $ORIENTDB_HOME/bin/console.sh export.sql

When the OrientDB starts, it executes each of the commands given in the SQL files, creating or connecting to the database, creating the classes and inserting the data from the Relational database. You now have a working instance of OrientDB to use.

Using the Database

You now have an OrientDB Document database where relationships are direct and handled without the use of joins.

  • Query for all posts with comments:

    orientdb> SELECT FROM Post WHERE comments.size() > 0
    
  • Query for all posts where the comments contain the word "flame" in the text property:

    orientdb> SELECT FROM Post WHERE comments CONTAINS(text 
              LIKE '%flame%')
    
  • Query for all posts with comments made today, assuming that you have added a date property to the Comment class:

    orientdb> SELECT FROM Post WHERE comments CONTAINS(date > 
              '2011-04-14 00:00:00')
    

For more information, see