<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>emphess .NET &#187; MongoDB</title>
	<atom:link href="http://www.emphess.net/tag/mongodb/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.emphess.net</link>
	<description>Freshly Draught Code</description>
	<lastBuildDate>Fri, 11 Nov 2011 11:57:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>The Object-Document Mismatch: MongoDB and db4o with Linq</title>
		<link>http://www.emphess.net/2010/05/05/the-object-document-mismatch-mongodb-and-db4o-with-linq/</link>
		<comments>http://www.emphess.net/2010/05/05/the-object-document-mismatch-mongodb-and-db4o-with-linq/#comments</comments>
		<pubDate>Wed, 05 May 2010 15:43:07 +0000</pubDate>
		<dc:creator>Christoph Menge</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[db4o]]></category>
		<category><![CDATA[linq]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://www.emphess.net/?p=232</guid>
		<description><![CDATA[Rob Conery recently wrote about using MongoDB with Linq. I was really intrigued by the fact that you can use elegant, readable, type-safe Linq queries to access the MongoDB document database. To be honest, I had no clue what a document database really is, but when it speaks Linq it must be cool, I thought. So I [...]]]></description>
			<content:encoded><![CDATA[<p>Rob Conery recently <a href="http://blog.wekeroad.com/2010/03/04/using-mongo-with-linq">wrote about using MongoDB with Linq</a>. I was really intrigued by the fact that you can use elegant, readable, type-safe Linq queries to access the <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB document database</a>. To be honest, I had <strong>no clue</strong> what a <strong>document database</strong> really is, but when it speaks Linq it must be cool, I thought. </p>
<p>So I dug a bit deeper into MongoDB and NoRM, a nifty C# driver for MongoDB developed by <a href="http://andrewtheken.com/">Andrew Theken</a> and several others. You might want to <a href="http://github.com/atheken/NoRM">grab a copy on GitHub</a>, where you can see how incredibly active the project is! Now, back to the evaluation: what is the best way to evaluate a database? Build <a href="http://www.backlink-tracker.net/">a live product using it</a>, of course!</p>
<p>Since I was completely abusing db4o for said project (in fact, I am storing something you&#8217;d call documents there), I decided that this would be a great candidate for a migration. So now we&#8217;re migrating from an object database to a document database and from an ACID database to a NoSQL solution.</p>
<p>MongoDB is considered a <a href="http://nosql-database.org/">NoSQL</a> solution, <a href="http://www.kellblog.com/2010/04/11/yes-virginia-marklogic-is-a-nosql-system/">while db4o is considered &#8216;soft&#8217; NoSQL</a> &#8211; see Stefan&#8217;s comment at the bottom. Why the distinction, when neither relies on, supports or uses SQL in any way? Because that is not what the term &raquo;<em>NoSQL</em>&laquo; is really about. It&#8217;s probably one of the most misleading terms ever coined and perhaps it should read &raquo;Not ACID&laquo; or just &raquo;Persistence without Prejudgement, <em>PwoP</em>&laquo;. db4o makes ACID guarantees, comes from an embedded background and offers single-server durability, while MongoDB is made for the net, does not have single-server durability, supports MapReduce and is driven by JavaScript. </p>
<p>Hell, they couldn&#8217;t be more different. But then again, I can access both using nearly identical interfaces:</p>
<pre class="brush: csharp">
// Both stores understand Linq, so either query style could be used with either one:
var u = (from Note n in container
         where n.Text == null select n).ToList();
var v = session.Query&lt;Tag&gt;().Where(p =&gt; p.Name != null);
// Store an object graph to Mongo using NoRM:
session.Add(testNote);
// Store an object graph to db4o:
container.Store(testNote);
</pre>
<p>Don&#8217;t be fooled &#8211; although each of these lines could be used with MongoDB or db4o, and even with the very same classes, they still behave in fundamentally different ways. Also, for anything but the simplest problems, <strong>you can&#8217;t just persist the same domain models</strong>.</p>
<h2>What is a document now?</h2>
<p>A document is not just an unstructured piece of data. It&#8217;s not a BLOB. Instead, an instance of a class, <strong>plus all it refers to</strong>, could be a document:</p>
<pre class="brush: csharp">
class UniqueIdObject
{
    public Guid Id { get; private set; }
    public UniqueIdObject() { Id = Guid.NewGuid(); }
}

class Report : UniqueIdObject
{
    public Report() : base() { Tags = new List&lt;Tag&gt;(); }
    public string Text { get; set; }
    public List&lt;Tag&gt; Tags { get; set; }
}

class Tag : UniqueIdObject
{
    public string Name { get; set; }
}
</pre>
<p>Now we might want to store a report in the database, and the code needed is just:</p>
<pre class="brush: csharp">
using (NoRMSession session = new NoRMSession())
{
    Report newReport = new Report() { Text = &quot;Hello World, MongoDB!&quot; };
    newReport.Tags.Add(new Tag() { Name = &quot;Tag 1&quot; });
    newReport.Tags.Add(new Tag() { Name = &quot;Tag 2&quot; });
    session.Add(newReport);
}
</pre>
<p>That&#8217;s it! Wow! Of course, we haven&#8217;t taken care of indexing and the like, and since MongoDB is schemaless (or better, has a dynamic schema) we need to do that in code. But still, this is essentially all you need.</p>
<p>The important question now is: what happens to those little <code>Tags</code> we deliberately put into a separate class? Here the mapper hides a bit of the truth from us, because MongoDB works on so-called &#8220;collections&#8221;. <code>session.Add(newReport)</code> resolves to <code>session.Add&lt;Report&gt;(newReport)</code>, which in turn puts the <code>newReport</code> object <strong>into the Report collection!</strong> So the object graph, as it is, will be serialized into the Report collection, <em>including</em> our little Tag objects!</p>
<p>Each item with an orange border is an &#8216;atomic&#8217; item in its respective data store:</p>
<div id="attachment_242" class="wp-caption alignleft" style="width: 310px"><a href="http://www.emphess.net/wp-content/uploads/2010/05/db4o-graph.png"><img src="http://www.emphess.net/wp-content/uploads/2010/05/db4o-graph-300x184.png" alt="" title="db4o-graph" width="300" height="184" class="size-medium wp-image-242" /></a><p class="wp-caption-text">db4o serialized graph</p></div>
<p><div id="attachment_241" class="wp-caption alignleft" style="width: 310px"><a href="http://www.emphess.net/wp-content/uploads/2010/05/mongo.png"><img src="http://www.emphess.net/wp-content/uploads/2010/05/mongo-300x200.png" alt="" title="mongo" width="300" height="200" class="alignright size-medium wp-image-241" /></a><p class="wp-caption-text">mongo serialized document</p></div><br />
</p>
<div style="clear: both;"></div>
<p>Let&#8217;s naively try to fetch all tags:</p>
<pre class="brush: csharp">
var v = session.Query&lt;Tag&gt;().Where(p =&gt; p.Name != null);
</pre>
<p>This does not work: <code>v</code> is <code>null</code> because there is no tag collection! Instead, the tag lists are <strong>part of the reports</strong> we put into the <code>Report</code> collection. Note that the same query would work in db4o, because db4o stores references as references, while &#8216;documents&#8217; store the contained data instead of references. This is beautifully simple, but it&#8217;s also very different from what you might expect, and it has lots of implications for your object structure.</p>
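<p>To make the consequence concrete, here is a minimal sketch of collecting the tags from the reports that embed them. It runs purely in memory with LINQ to Objects; the <code>EmbeddedTagsDemo</code> class and its <code>AllTagNames</code> helper are made up for illustration and are not part of NoRM. In a real application the list of reports would be loaded from the Report collection first.</p>
<pre class="brush: csharp">
using System;
using System.Collections.Generic;
using System.Linq;

class Tag
{
    public string Name { get; set; }
}

class Report
{
    public Report() { Tags = new List&lt;Tag&gt;(); }
    public string Text { get; set; }
    public List&lt;Tag&gt; Tags { get; set; }
}

static class EmbeddedTagsDemo
{
    // Flattens the tags embedded in each report into a list of distinct tag names.
    public static List&lt;string&gt; AllTagNames(IEnumerable&lt;Report&gt; reports)
    {
        return reports
            .SelectMany(r =&gt; r.Tags)
            .Where(t =&gt; t.Name != null)
            .Select(t =&gt; t.Name)
            .Distinct()
            .ToList();
    }

    static void Main()
    {
        // Stand-in for the documents loaded from the Report collection.
        var first = new Report { Text = &quot;Hello World, MongoDB!&quot; };
        first.Tags.Add(new Tag { Name = &quot;Tag 1&quot; });
        first.Tags.Add(new Tag { Name = &quot;Tag 2&quot; });

        var second = new Report { Text = &quot;Another report&quot; };
        second.Tags.Add(new Tag { Name = &quot;Tag 2&quot; });

        var reports = new List&lt;Report&gt; { first, second };

        foreach (string name in AllTagNames(reports))
            Console.WriteLine(name);
    }
}
</pre>
<p>Note that this pulls every report to the client just to read its tags, which is fine for a handful of documents but not something you would want on a large collection.</p>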
<h2>Thoroughly think through your schemaless schema</h2>
<p>MongoDB is made for <em>scalability and simplicity</em>, so it does not care for our foolish approach to fetch tags directly. There are ways to access that data directly, however. We could use a deep-graph query or write a JavaScript Map/Reduce instruction, but that is a bit out of scope right now. What&#8217;s more important is that the document model calls for <strong>changes to our domain model objects</strong>: if we really want to store a reference, we need to do so manually, in ye olde SQL way, by storing the associated Id instead of the object. Of course, that makes deserialization a bit more complicated, because the objects we now retrieve from the database aren&#8217;t ready to use as they come.</p>
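<p>A rough sketch of what storing the Id instead of the object could look like. Everything here is an assumption made for illustration: <code>ReportWithTagIds</code> is a hypothetical variant of the <code>Report</code> class, and a plain dictionary stands in for a separate Tag collection, so no driver calls are involved.</p>
<pre class="brush: csharp">
using System;
using System.Collections.Generic;

class Tag
{
    public Tag() { Id = Guid.NewGuid(); }
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical variant of Report that stores tag Ids instead of Tag objects.
class ReportWithTagIds
{
    public ReportWithTagIds() { Id = Guid.NewGuid(); TagIds = new List&lt;Guid&gt;(); }
    public Guid Id { get; set; }
    public string Text { get; set; }
    public List&lt;Guid&gt; TagIds { get; set; }
}

static class ManualReferenceDemo
{
    // Resolves the stored Ids against a lookup that stands in for a Tag collection.
    // Ids without a matching tag are skipped.
    public static List&lt;Tag&gt; ResolveTags(ReportWithTagIds report,
                                        IDictionary&lt;Guid, Tag&gt; tagCollection)
    {
        var result = new List&lt;Tag&gt;();
        foreach (Guid id in report.TagIds)
        {
            Tag tag;
            if (tagCollection.TryGetValue(id, out tag))
                result.Add(tag);
        }
        return result;
    }

    static void Main()
    {
        var tag1 = new Tag { Name = &quot;Tag 1&quot; };
        var tag2 = new Tag { Name = &quot;Tag 2&quot; };

        // In-memory stand-in for a dedicated Tag collection.
        var tagCollection = new Dictionary&lt;Guid, Tag&gt;();
        tagCollection[tag1.Id] = tag1;
        tagCollection[tag2.Id] = tag2;

        var report = new ReportWithTagIds { Text = &quot;Hello World, MongoDB!&quot; };
        report.TagIds.Add(tag1.Id);
        report.TagIds.Add(tag2.Id);

        // The extra resolution step that embedded tags would not need.
        foreach (Tag tag in ResolveTags(report, tagCollection))
            Console.WriteLine(tag.Name);
    }
}
</pre>
<p>With this layout the tags can be queried on their own, at the price of a second lookup whenever a report is loaded.</p>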
<p>However, automating that process introduces several complications: the need to <strong>handle cyclic references</strong>; a mechanism for fetching or activating objects on the fly, called <strong>Transparent Activation</strong> in the db4o world; and making sure we&#8217;re not incurring a massive performance hit along the way.</p>
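<p>To give an idea of what fetching on the fly might involve, here is a deliberately simplified sketch of a lazy reference. It is not how db4o implements Transparent Activation; <code>LazyRef</code> is a hypothetical helper, the loader delegate stands in for a database lookup, and the class is not thread-safe. Because the target is only loaded when it is first accessed, two objects that refer to each other do not pull each other in endlessly.</p>
<pre class="brush: csharp">
using System;
using System.Collections.Generic;

// Hypothetical lazy reference: keeps only the Id and loads the target on first access.
class LazyRef&lt;T&gt; where T : class
{
    private readonly Func&lt;Guid, T&gt; _loader;
    private T _value;
    private bool _loaded;

    public LazyRef(Guid id, Func&lt;Guid, T&gt; loader)
    {
        Id = id;
        _loader = loader;
    }

    public Guid Id { get; private set; }

    public bool IsLoaded { get { return _loaded; } }

    public T Value
    {
        get
        {
            if (!_loaded)
            {
                // First access: fetch the target and remember it.
                _value = _loader(Id);
                _loaded = true;
            }
            return _value;
        }
    }
}

class Tag
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

static class LazyRefDemo
{
    static void Main()
    {
        var tag = new Tag { Id = Guid.NewGuid(), Name = &quot;Tag 1&quot; };

        // In-memory stand-in for a Tag collection, plus a counter for lookups.
        var store = new Dictionary&lt;Guid, Tag&gt;();
        store[tag.Id] = tag;
        int lookups = 0;

        var reference = new LazyRef&lt;Tag&gt;(tag.Id, id =&gt; { lookups++; return store[id]; });

        Console.WriteLine(reference.IsLoaded);   // nothing has been fetched yet
        Console.WriteLine(reference.Value.Name); // triggers the lookup
        Console.WriteLine(reference.Value.Name); // reuses the loaded object
        Console.WriteLine(lookups);              // number of lookups performed
    }
}
</pre>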
<p>Also, updating objects can be painful. Suppose we store a list of reports for each user. We could put the list of reports directly into the user object, <strong>or</strong> store a list of <code>ReportIds</code> for each user and put the reports into their very own Report collection. As usual, <strong>there is no silver bullet</strong>, so this decision depends on the specific needs of the application. Whichever option we choose is not apparent to code that uses the resulting objects, yet that code depends on it. This leads to some unwanted <strong>strong coupling</strong>:</p>
<pre class="brush: csharp">
class ReportService
{
  private NoRMSession _session;

  // ...
  public void UpdateReportDetail()
  {
    this.ReportDetailXY = ComplicatedCalculation();
    this.LastChanged = DateTime.UtcNow;

    // If the reports are in their very own Report Collection, this
    // is fine. However, if they are contained as lists in the user
    // who owns them, we&#039;re in trouble and this will fail!
    _session.Update(this);
  }
}
</pre>
<p>This might not be a big problem for <strong>really small applications</strong>, and if you really need performance (why else would you choose a NoSQL system?) you have to fine-tune your objects anyway. Still, I have the feeling that there is some room for improvement here, and a <strong>basic</strong> wrapper could help in overcoming some of the issues raised. Some basic ideas on how to approach this will follow shortly.</p>
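<p>Purely as a sketch of the direction such a wrapper could take, and not the follow-up itself: the storage decision can be hidden behind a small interface so that calling code no longer cares where reports live. All names here (<code>IReportStore</code>, <code>SeparateReportStore</code>, <code>EmbeddedReportStore</code>) are hypothetical, and dictionaries stand in for the actual collections.</p>
<pre class="brush: csharp">
using System;
using System.Collections.Generic;

class Report
{
    public Report() { Id = Guid.NewGuid(); }
    public Guid Id { get; set; }
    public string Text { get; set; }
}

class User
{
    public User() { Id = Guid.NewGuid(); Reports = new List&lt;Report&gt;(); }
    public Guid Id { get; set; }
    public List&lt;Report&gt; Reports { get; set; }
}

// Hypothetical abstraction: callers update a report without knowing where it lives.
interface IReportStore
{
    void Update(Guid ownerId, Report report);
    Report Find(Guid ownerId, Guid reportId);
}

// Variant 1: reports live in their own collection, keyed by report Id.
class SeparateReportStore : IReportStore
{
    private readonly Dictionary&lt;Guid, Report&gt; _reports = new Dictionary&lt;Guid, Report&gt;();

    public void Update(Guid ownerId, Report report) { _reports[report.Id] = report; }

    public Report Find(Guid ownerId, Guid reportId)
    {
        Report found;
        return _reports.TryGetValue(reportId, out found) ? found : null;
    }
}

// Variant 2: reports are embedded in the owning user, so an update rewrites the user.
class EmbeddedReportStore : IReportStore
{
    private readonly Dictionary&lt;Guid, User&gt; _users = new Dictionary&lt;Guid, User&gt;();

    public void AddUser(User user) { _users[user.Id] = user; }

    public void Update(Guid ownerId, Report report)
    {
        // Throws KeyNotFoundException if the owner is unknown.
        User owner = _users[ownerId];
        int index = owner.Reports.FindIndex(r =&gt; r.Id == report.Id);
        if (index &gt;= 0) owner.Reports[index] = report;
        else owner.Reports.Add(report);
        // A real implementation would now save the whole user document.
    }

    public Report Find(Guid ownerId, Guid reportId)
    {
        User owner;
        if (!_users.TryGetValue(ownerId, out owner)) return null;
        return owner.Reports.Find(r =&gt; r.Id == reportId);
    }
}

static class ReportStoreDemo
{
    static void Main()
    {
        var user = new User();
        var embedded = new EmbeddedReportStore();
        embedded.AddUser(user);

        // The calling code is identical for both storage layouts.
        IReportStore[] stores = { embedded, new SeparateReportStore() };
        foreach (IReportStore store in stores)
        {
            var report = new Report { Text = &quot;Draft&quot; };
            store.Update(user.Id, report);

            var updated = new Report { Id = report.Id, Text = &quot;Final&quot; };
            store.Update(user.Id, updated);

            Console.WriteLine(store.Find(user.Id, report.Id).Text);
        }
    }
}
</pre>
<p>The price is one more layer to maintain, and the performance characteristics of the two variants still differ, so the decision does not go away; it only stops leaking into every caller.</p>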
<h2>Wrapping it up</h2>
<p>Obviously, I&#8217;m comparing apples and oranges here: db4o is made to <strong>make persistence easier</strong>, especially with complex domain model objects. db4o, being an <strong>object database</strong>, behaves exactly the way you&#8217;d expect objects under serialization to behave, but that makes it quite complex. MongoDB is focused on <strong>simplicity and scalability</strong>. Through the document concept, you gain simplicity on the database, administration and driver side, but you have to struggle with a slight <strong>impedance mismatch</strong> in code, especially when querying through Linq.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.emphess.net/2010/05/05/the-object-document-mismatch-mongodb-and-db4o-with-linq/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

