Kotan Code 枯淡コード

In search of simple, elegant code

Modeling Arrays and Complex, Nested Objects in Neo4j

Today I was messing around with Neo4j, as I am wont to do, and I ran across a modeling scenario that I hadn’t expected. My previous experience with NoSQL databases has been with MongoDB. I’ve used Mongo for multiple projects, including a MUD and an MMORPG server both written in Akka and Scala.

When using MongoDB as a backing store for my enterprise application, I’ve been storing multi-level objects in there without much concern. I’ve got a root object that contains an array of nested objects which, in turn, contain even more nested objects. The structure is a few levels deep, but techniques like Play Framework’s JSON inception make serializing and deserializing to and from Scala case classes pretty easy. All in all, it’s been working out quite well.

When looking at how to store information that I might otherwise put in a NoSQL store in Neo4j to take advantage of its graph query and traversal capabilities, I noticed that Neo4j, while “speaking JSON”, doesn’t do so at quite the robust level that MongoDB does. In short, you can’t store nested objects, or arrays of nested objects, in a Neo4j node or relationship. (An array of simple values, such as numbers or strings, is fine as a property value.) Now, before you complain about this, there are a pile of good reasons for it, and I could only scratch the surface if I tried to explain them. The bottom line is that nodes and relationships in Neo4j have properties, which are name-value pairs, not recursive name-value maps like you find in Mongo.

So, with that restriction in place, how do we model complex, nested objects and arrays in Neo4j?

First we can take a look at why we need an array. In many cases in my model, the arrays of nested objects on a root object were actually modeling a parent-child relationship where the children had their own set of properties. So, let’s say you want a node in your graph to be a WellArmedZombie and that particular zombie needs an array of weapons. If that zombie owns or contains or is responsible for those weapons, you can split that single node into many: one node for the core WellArmedZombie, one node for each of its weapons, and a relationship (probably -[:WEAPON]->) between the zombie and each weapon.
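To make that split concrete, here is a minimal sketch in plain Python, with no Neo4j driver involved. The field names (name, hunger, weapons, damage) and the split_weapons helper are made up for illustration; only WellArmedZombie and the WEAPON relationship come from the scenario above.

```python
def split_weapons(zombie):
    """Split a document-style zombie into a core node, weapon nodes, and edges."""
    # Everything except the nested array stays on the core zombie node.
    core = {key: value for key, value in zombie.items() if key != "weapons"}
    # Each array element becomes its own node with its own properties.
    weapon_nodes = [dict(weapon) for weapon in zombie.get("weapons", [])]
    # One (start, type, end) triple per weapon; the end is an index into weapon_nodes.
    relationships = [("zombie", "WEAPON", i) for i in range(len(weapon_nodes))]
    return core, weapon_nodes, relationships


# Hypothetical document, shaped the way it might sit in Mongo.
document = {
    "name": "Bob",
    "hunger": 9,
    "weapons": [
        {"name": "chainsaw", "damage": 12},
        {"name": "shotgun", "damage": 20},
    ],
}

core, weapon_nodes, relationships = split_weapons(document)
for start, rel_type, index in relationships:
    print(f"({start})-[:{rel_type}]->({weapon_nodes[index]['name']})")
```

The array disappears from the zombie entirely. What used to be "element N of weapons" is now a node you can reach by following a relationship.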

In my own exercise attempting to remodel a MongoDB database as a Neo4j database, nested objects were very easy to convert. In fact, in almost all cases, I was able to take the field name that referred to the nested item and turn it into a relationship pointing to a related node containing the properties that used to belong to the nested item.

Let’s say this same WellArmedZombie node, in Mongo, had a property called armor which was a nested JSON object containing properties like strength, material, and color. It is pretty easy to switch that into Neo4j, where we have a relationship that looks like (zombie)-[:ARMOR]->(armor).
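The field-name-becomes-relationship trick is mechanical enough to sketch. The Python below is an illustration under a few assumptions: the nested_to_relationship helper, the Armor label, and the sample values are hypothetical, and the generated text uses the $parameter style of current Cypher. It only builds a query string and a parameter map; it does not talk to a database.

```python
def nested_to_relationship(document, field, root_label, child_label):
    """Turn one nested field into a Cypher CREATE statement plus its parameters."""
    # Properties that stay on the root node (assumed to be flat already).
    root_props = {key: value for key, value in document.items() if key != field}
    # Properties that move to the new, related node.
    child_props = dict(document[field])
    # The old field name becomes the relationship type: armor -> ARMOR.
    rel_type = field.upper()
    query = (
        f"CREATE (root:{root_label})-[:{rel_type}]->(child:{child_label}) "
        "SET root = $root, child = $child"
    )
    return query, {"root": root_props, "child": child_props}


# Hypothetical zombie with a nested armor object.
zombie = {
    "name": "Bob",
    "armor": {"strength": 7, "material": "kevlar", "color": "black"},
}

query, params = nested_to_relationship(zombie, "armor", "WellArmedZombie", "Armor")
print(query)
print(params)
```

Labels and relationship types are spliced into the query text here because they can't be passed as parameters, so in real code they should come from a fixed list rather than from user input.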

So, the one lesson I learned today was really that thinking relationally requires some unlearning before you can really take advantage of a NoSQL database model. Further, thinking in a graph requires you to unlearn some pretty standard conventions and best practices that are a part of thinking in documents rather than nodes and edges. I realize that as we discover new tools, we often feel like the shiny new tool will fix all problems. It doesn’t, and I’ve got a couple models/domains I work with that I don’t think are good candidates for Neo4j. I have another model that I’ve used where I think a hybrid Neo4j/Mongo approach might be appropriate – put all the information you need for queries, traversals, and quick summary display into the Neo4j database and put the large, bulky (heavily nested and composed) data into Mongo for supplementary query.
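For the hybrid idea, the split might look something like the following sketch. It assumes each document carries Mongo’s usual _id field, and that the graph node keeps a copy of it in a property I’ve arbitrarily called doc_id so you can hop from a traversal result back to the full document. The split_for_hybrid helper and the sample fields are hypothetical.

```python
def split_for_hybrid(document, graph_fields):
    """Split a document into lightweight graph properties and a bulky remainder."""
    # The graph node keeps only the fields needed for queries and summaries...
    graph_props = {f: document[f] for f in graph_fields if f in document}
    # ...plus a pointer back to the full document in the document store.
    graph_props["doc_id"] = str(document["_id"])
    # Everything else, including _id, stays in the document store.
    bulky = {k: v for k, v in document.items() if k not in graph_fields}
    return graph_props, bulky


# Hypothetical document with a small summary and a heavily nested history.
document = {
    "_id": "z-42",
    "name": "Bob",
    "threat_level": 9,
    "history": {"bites": [{"victim": "Al", "year": 2012}]},
}

graph_props, bulky = split_for_hybrid(document, ["name", "threat_level"])
print(graph_props)
print(bulky)
```

The graph side stays flat and small, which is what Neo4j wants, and the nested history never has to be flattened at all.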

Regardless, I am pretty damn excited to be living and working in a time where the kind of computing power that Neo4j offers is pretty much ubiquitous, free, and readily available on my machine and in the cloud. It’s a great time to be a developer.