Student: Johann Werner Köck (2023)
Supervisor: a.Univ.-Prof. DI Dr. Wolfram Wöß
Abstract
Highly connected data play a central role in many of today's data-intensive applications. While the relational data model has dominated the handling of transactional data for decades, it faces challenges when dealing with data characterized by complex relationships. In contrast, the graph model offers a more natural representation of such data, treating both entities and the relationships between them as first-class citizens in the form of nodes and edges. Although it is generally possible to represent graphs in relational databases, this approach often leads to cumbersome operations and poor runtime performance when analyzing them. Consequently, a new type of database, the graph database, has emerged with the specific purpose of efficiently managing highly connected data by representing them directly as graphs. In this context, new products and players have successfully established themselves in the database market. Numerous publications present compelling evidence for the superiority of graph databases over relational databases when it comes to analyzing graph data. Such comparisons typically focus on native graph databases, as they currently dominate the field and are widely used in practice. In recent years, well-known relational database products have also started offering graph-related features, probably in response to the increasing demand in the growing market for graph data processing. Naturally, the related announcements and product descriptions are promising, although it is questionable whether the integration of the relational model and the graph model can be successful. Currently, however, the literature contains no systematic comparison of native graph databases and graph-related features on relational databases.
This work aims to address this gap by, first, proposing a catalog of criteria for evaluating both native graph databases and graph-related features on relational databases. The intention is a comprehensive investigation that emphasizes aspects such as the implementation of the graph model and the graph analytical capabilities. Second, the catalog of criteria is applied to a concrete example, evaluating the graph-related features of the Microsoft SQL Server database.
In this evaluation, Neo4j, one of the prominent native graph database products, serves as a reference for comparison. On the one hand, this work enables a discussion regarding the criteria for comparison. On the other hand, it focuses on current developments in the field of databases and ultimately guides users in selecting the appropriate product for specific use cases.