When the Alfresco Node Store does not rule

There are times when things just don’t match, sometimes accidentally, other times by design. Sometimes I feel that mismatch situations hit me more often than they hit others. I guess that is a major reason why I like dynamic languages: they enable you to fix situations where you would otherwise get stuck. My customization of the Alfresco Share invitation mail involved the accidental case. This post illustrates a by-design situation.

There is no Database ruling all Requirements

For ages, relational databases ruled persistence, and for good reasons. One reason is that carefully modeled, normalized data allows you to postpone search concerns. Another is that transactions are a good thing to have. Besides, a fair amount of real-world data can be decently mapped to a relational model.

Quite recently, the NoSQL movement taught us there are also very good reasons to have other databases. Just like with other problem domains, there simply is no golden hammer in persistence. The Seven Databases Song makes this very clear. :)

Alfresco’s node store is no exception. Just like a relational database, it is quite universal and a good fit for content management persistence. And just like an RDBMS, it cannot address every situation one could come up with.

A Real World DataMismatchException

I was given the requirement to model accounting papers in Share (I know, there are super-sophisticated open source solutions available). These papers were represented by PDF documents with (single-valued) properties like a date, a number and so on. That is the easy part. But they also have an arbitrary number of ordered entries, each consisting of an account, amounts and textual information. How are you supposed to map that to the Alfresco node model? Sure, you can create a custom entry type and use a many-valued association from the paper. In fact, that was the idea I (naturally) came up with first. But I soon realized that this approach is not a good match for Alfresco Forms and that I would end up writing a lot of code. On top of that, normalizing would provide little, if any, value. There was no use case requiring queries on entry content, and joins are not supported anyway.

Serialized (C)LOB to the Rescue

As data normalization provided no value, I decided to implement the paper using the Serialized (C)LOB pattern: the entry data is stored as JSON in a single entries field. The solution uses a custom component that extends Alfresco.component.Base and is based on YUI DataTable, plus a custom Freemarker template for the field. This is what it looks like in the form:

Screenshot of account entries table
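To make the stored value more concrete, here is a minimal sketch of what the serialized entries could look like and how they round-trip through JSON. The property names (account, amount, text) and the helper names serializeEntries and parseEntries are assumptions for illustration; the original component may use different ones.

```javascript
// Hypothetical entry shape: the property names account, amount and text are
// assumptions for illustration, not the names used in the original component.
var entries = [
  { account: "1200", amount: 119.0, text: "Invoice 4711" },
  { account: "8400", amount: -100.0, text: "Revenue" },
  { account: "1776", amount: -19.0, text: "VAT 19%" }
];

// Serialize the ordered entries into the single string stored in the property.
function serializeEntries(list) {
  return JSON.stringify(list);
}

// Parse the stored string back; an empty or missing value yields no entries.
function parseEntries(value) {
  if (value === undefined || value === null || value === "") {
    return [];
  }
  return JSON.parse(value);
}

var stored = serializeEntries(entries);
var restored = parseEntries(stored);
```

Because a JSON array keeps its element order, the ordering of the entries survives without any extra index property.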

The component is instantiated in the template just like any other:

(function()
{
  var entries = <#if field.value?has_content>${field.value}<#else>[]</#if>;
  new Alfresco.EntriesComponent("${controlId}", "${fieldHtmlId}").setOptions(
   {
      entries: entries
   }).setMessages(
      ${messages}
   );
})();

It uses a hidden field to store and submit the serialized data.

<input type="hidden" id="${fieldHtmlId}" name="${field.name}"/>
<!-- more markup -->
 <div id="${controlId}"></div>

The final piece of the puzzle is to set up the DataTable to update the value of the hidden field whenever the table’s data changes:


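var updateEntriesField = function(oArgs) {
    var rows = [];
    var records = me.widgets.dataTable.getRecordSet().getRecords();
    for (var i = 0; i < records.length; i++) { rows.push(records[i].getData()); }
    // Assumed completion; me.fieldHtmlId is a hypothetical name for the hidden field's id.
    YAHOO.util.Dom.get(me.fieldHtmlId).value = YAHOO.lang.JSON.stringify(rows);
};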
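The same idea can be tried outside the browser. The following is a framework-free sketch, not the component's actual code: makeRecord is a hypothetical stand-in for a YUI Record (only getData() is modeled), and the field is any object with a writable value, such as the hidden input element.

```javascript
// Stand-in for a YUI Record: only the getData() accessor is modeled here.
function makeRecord(data) {
  return { getData: function () { return data; } };
}

// Collect the row data in table order and write the JSON into the field.
// `field` is any object with a writable `value`, such as a hidden input element.
function updateEntriesField(records, field) {
  var rows = [];
  for (var i = 0; i < records.length; i++) {
    rows.push(records[i].getData());
  }
  field.value = JSON.stringify(rows);
  return field.value;
}

// Example data; the property names are assumptions for illustration.
var hiddenField = { value: "" };
var records = [
  makeRecord({ account: "1200", amount: 119 }),
  makeRecord({ account: "8400", amount: -119 })
];
updateEntriesField(records, hiddenField);
```

In the real component, a function like this would presumably be subscribed to whichever DataTable events signal that rows were added, removed or edited, so that the hidden field always holds the current state when the form is submitted.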

Other "Node Store does not rule" Situations

The situation described here is one example where data structurally does not map well to the node store. Another real-world example may be content-related data housed in other databases.

Andreas Steffan
Pragmatic ? Scientist and DevOps Mind @ Contentreich. Believes in Open Source, the Open Web and Linux. Freelancing in DevOps-, Cloud-, Kubernetes, JVM- and Contentland and speaks Clojure, Kotlin, Groovy, Go, Python, JavaScript, Java, Alfresco and WordPress. Built infrastructure before it was cool. ❤️ Emacs.

12 thoughts on “When the Alfresco Node Store does not rule”

  1. Hi Tjarda,

    thanks for your comment.

    I had a quick look at MetaDB-Connector. From what I understand, it is helpful if what you need is to enhance your nodes with display and search capabilities for data which is stored in an RDBMS.

    My requirements were different. All I needed was storing/retrieving “exceptional” structured data along the way.

  2. Hello Andreas,

    I came across a similar use case in a PoC I did last year. We actually had to do some querying on top of the custom data, but did not have the option to model this via nodes and associations as we expected a rather large volume of instances relating to the same (normalized) value object and did not want to incur the DB overhead for millions of value objects. We solved this via a custom property datatype and enhanced the Alfresco node store to decouple persistence from the UI – the property datatype in combination with a converter handles that transparently on the repository layer.
    Unfortunately, the contribution / enhancement resulting from that PoC is still not prioritized: https://issues.alfresco.com/jira/browse/ALF-10838

  3. Hi Axel,

    thanks for commenting.

    You know what ? I was looking at ALF-10838 coincidentally this morning – before this post went out. :)

    I understand that your enhancement gives developers the freedom to implement property persistence for custom types. I just don’t fully grasp the scope yet.

    Persistence is “easy” as long as write operations are not involved. Things get more complicated as soon as you need distributed transactions / 2pc. Even more so when another database involved does not even have a clue what a transaction is.

    Can you please clarify ?

  4. Hi Jan,

    seems like bad luck that I did not find it and nobody suggested anything when I asked on Stack Overflow.

    Thanks still for pointing it out, though. :)

    The problem with the serialization solution here is that it does not really seem worth factoring out a reusable part. Next time you may find it is a tree you need serialized.

  5. Hello Andreas,

    the scope of ALF-10838 is limited to structured custom datatypes maintained within the Alfresco database / schema. It was/is not intended to be used for some kind of “external” persistence, although it would allow something like this to be implemented.

    The enhancement is meant to help overcome the serialization issue for custom datatypes. Before you had to either stringify values before passing them to the NodeService or live with persistence as byte streams. With it you can have your own, custom and structured value table inside your Alfresco schema, which is hooked into the persistence framework by the simple action of registering a datatype converter. Transactions are still handled by Alfresco and you only have to take care of create / load operations via your custom DAO and MyBatis for most usages. Having your own value table may also help some special use cases, e.g. when you may need to do database queries based on complex values and SOLR is not an option due to lag.

  6. Thanks for clarification, Axel.

    Indeed somewhat similar (property value) to what I was doing, although it seems you aim to extend uniformly. Serialization, on the other hand, looks more like a hack at first sight, even though there are very valid use cases. It is simple, flexible and fine when you don’t have specific search or storage requirements. That obviously did not apply to you.

    Given the fact that ALF-10838 has been idle for half a year, I wonder what the catch is, if there is one.

    Does it somehow break uniformity ?
    Does it play well with FormsService and companions ?
    Is there any reason why alfresco prohibits “messing in their schema” ?

  7. Hi Andreas,
    you shouldn’t interpret a contribution ticket that is idle as “prohibition”. I think it isn’t currently prio 1. Also there are no votes on this ticket…
    Cheers, jan

  8. I don’t interpret an idle contribution ticket as a prohibition. I am not even saying that extensions must not make changes to the alfresco schema. In fact, I was just wondering.

    I think it would be nice if the contribution process could be made a little more transparent. Just look at:

    https://issues.alfresco.com/jira/secure/IssueNavigator.jspa?mode=hide&requestId=10906

    To me, this looks like it is fairly common that it takes a long time for at least some contributions until they either get accepted or rejected.

    With special regards to ALF-10838, I personally think it addresses an edge case and hence we don’t see a lot of votes. Still I think half a year should be enough to process the issue – assuming there is no catch and it does not open a can of worms. Risk and impact seem moderate to me.

    PS: If I were Alfresco, and you were an attorney, I would definitely make sure to hire you. ;)

  9. Just watched the Alfresco Hangout and learned that Chris Paul pushed the “JSON stored in property value” approach to the next, more general level with alfraca. There may be valid use cases, but to me it looks pretty close to doing it wrong. If what you have is structured data, you should try using a database that supports it rather than going DIY.
