Graph Databases and Relational Databases
A relational database (RD) stores data in the form of tables, whereas a graph database (GD) stores data in the form of graphs. Triple stores (TS) fall under graph databases. More specifically, TS are a type of NoSQL GD. However, TS differ from other NoSQL graph databases in several ways. Triple stores have been implemented to store RDF, which is a special kind of graph: a directed labelled graph. NoSQL graph databases can store different types of graphs: unlabelled graphs, undirected graphs, weighted graphs, hypergraphs, etc. [Diversity]. Major points of difference between GD and RD include [1]:
- Schema: GDs are occurrence based, whereas RDs are schema based. In an RD the schema/structure of the database is fixed beforehand, while in a GD no prior structure is fixed. So for conventional applications, where we know the schema of the stored data in advance, we should use an RD, and for the opposite scenario we should use a GD, which gives us the flexibility of adding new data and relationships as they are encountered.
- Navigation: In a GD you start with a root object and then traverse to related objects, whereas in an RD navigation happens through joins. Navigation via joins tends to become harder to write and more expensive as the number of hops grows.
- Recursion: RDs do not handle recursion naturally, although support is provided through some extensions (such as recursive common table expressions). In contrast, GDs handle recursion well, because following relationships to arbitrary depth is what graph traversal does. This is the beauty of graphs.
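To make the navigation and recursion points concrete, here is a minimal sketch that answers the same question ("who is below this person, at any depth?") in two ways: relationally, with a recursive query over a table, and graph-style, by starting at a root and traversing. The `reports_to` table, the names in `EDGES` and both function names are made up for illustration, and SQLite stands in for a generic RD. This is not how any particular graph database works internally.

```python
import sqlite3
from collections import deque

# Hypothetical sample data: (employee, manager) pairs.
EDGES = [("bob", "alice"), ("carol", "alice"), ("dave", "bob"), ("erin", "dave")]


def descendants_sql(root):
    """Relational style: rows in a table, recursion via a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reports_to (employee TEXT, manager TEXT)")
    conn.executemany("INSERT INTO reports_to VALUES (?, ?)", EDGES)
    rows = conn.execute(
        """
        WITH RECURSIVE sub(name) AS (
            SELECT employee FROM reports_to WHERE manager = ?
            UNION
            SELECT r.employee
            FROM reports_to AS r
            JOIN sub ON r.manager = sub.name
        )
        SELECT name FROM sub
        """,
        (root,),
    ).fetchall()
    conn.close()
    return sorted(name for (name,) in rows)


def descendants_graph(root):
    """Graph style: start at a root node and traverse to related nodes."""
    adjacency = {}
    for employee, manager in EDGES:
        adjacency.setdefault(manager, []).append(employee)
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


print(descendants_sql("alice"))
print(descendants_graph("alice"))
```

Both functions return the same set of names. The difference is in how the question is phrased: the relational version has to express "repeat this join until nothing new appears", while the graph version simply follows edges outward from the root.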
There are many other differences, but in the end it comes down to your application. I think the first point above should answer your main question. Performance considerations will also help you decide whether to go for an RD or a GD. Some GDs are: Neo4j, AllegroGraph, Stardog, OpenLink Virtuoso.
Some important points in regard to databases include:
- Graphs can be implemented in a relational database using foreign keys, but this often needs link tables to model the complexities [2]. The difficulty with building triple stores over SQL is that, although the triples themselves can be stored this way, efficient querying of a graph-based RDF model (e.g., mapping SPARQL onto SQL queries) is hard to achieve [Jean]. Also, SPARQL, the standard query language used by triple stores, can require a lot of self joins, something RDs are not optimised for [Michael].
- Triple stores are essentially GDs, although they do not necessarily store data internally in the form of graphs. Based on their implementation, the different types of TS include [Diversity]:
  - Native triple stores are implemented from scratch to store and access RDF data efficiently. These include: 4Store, AllegroGraph, BigData, Jena TDB, Sesame, Stardog, OWLIM and uRiKa.
  - RDBMS-backed triple stores are built by adding an RDF-specific layer to an existing RDBMS. These include: Jena SDB, IBM DB2 and Virtuoso.
  - NoSQL triple stores have recently been investigated as possible storage managers for RDF. For example, CumulusRDF is built on top of Cassandra. The folks at Seevl have implemented a triple store on top of Redis.
- Among GDs, RDF database systems are the only standardised ones at the moment. They are built upon W3C's Linked Data technology stack [Arto].
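The self-join point can be illustrated with a rough sketch of triples kept in a single SQL table. Everything here is an assumption made for the example: the three-column `triples` table, the sample data and the function name are invented, and real RDBMS-backed triple stores use more elaborate layouts. The query corresponds roughly to a three-pattern graph query of the kind SPARQL expresses (person knows someone, who knows someone else, who works at some organisation); each pattern becomes one more copy of the same table in the join.

```python
import sqlite3

# Hypothetical sample triples: (subject, predicate, object).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("bob", "worksAt", "initech"),
]


def employers_of_friends_of_friends(person):
    """Where do the people known by the people `person` knows work?"""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)
    rows = conn.execute(
        """
        SELECT t3.o
        FROM triples AS t1                    -- person knows ?b
        JOIN triples AS t2 ON t2.s = t1.o     -- ?b knows ?c
        JOIN triples AS t3 ON t3.s = t2.o     -- ?c worksAt ?org
        WHERE t1.s = ?
          AND t1.p = 'knows'
          AND t2.p = 'knows'
          AND t3.p = 'worksAt'
        """,
        (person,),
    ).fetchall()
    conn.close()
    return [org for (org,) in rows]


print(employers_of_friends_of_friends("alice"))
```

Every additional triple pattern in the graph query adds another self join on `triples`, which is why longer patterns become awkward for a plain relational layout.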