What Is Apache Sling and How It Makes CQ5 a Great CMS

CQ5 (AEM) is a brilliant content management system. It's different from the rest of the herd because of the framework it's built on, i.e. Sling, and its repository system (JCR).

Websites have a tree structure, but in a traditional CMS pages are just rows in an RDBMS table, and child pages are not really child pages; each just has a special entry that says which page is its parent. Not in CQ5: pages are nodes, and child pages are nodes under the parent page. As a developer, this makes a website much easier to visualize. It also leads to URLs that make sense, with no ugly post IDs or content IDs as parameters. All this is possible because of Sling.

Sling is a web framework built to be used with a JCR as its data store. Sling is built on REST principles. It's the combination of JCR and REST that makes Sling a match made in heaven for content management systems.

So what's REST? I won't list the REST principles; there are plenty of articles out there on REST. From a content management point of view, it means every atomic object of data on your website is a resource, something that can be rendered on its own. So every page is a resource. Not just the page: the images in it are resources too. So are the slider, the navigation and the page body. You get to control how fine-grained the resources are, i.e. the title could be a resource of its own or it could be a part of the page resource. The bottom line is that data has to be a resource.

The URL tells which resource you're looking for. The URL can select a page or the slider in the page. The reason you get to have such URLs is the JCR (Java Content Repository): unlike a traditional RDBMS, it has no tables and rows. The JCR is a tree, similar to your file system, with nodes starting from the root. So your page is a node. Under it lie the child pages as child nodes. Apart from the child pages, a page node also has a content node under which the resources of the page (the sliders, the page body) are stored as nodes. [You can create a similar effect with traditional RDBMS tables too, using APIs like ASP.NET Web API with Entity Framework, or Jersey in Java. It still isn't as clear as JCR.]
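
To make the "URL is a path into a tree" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the JCR API: the `node` and `resolve` helpers, the sample paths and the property names are all made up for illustration. The only name borrowed from CQ5 is `jcr:content`, which is what the content node under a page is called there.

```python
# Toy model of a JCR-style content tree (NOT the real JCR API).
# Every node is a dict holding its properties and its named children.
def node(properties=None, children=None):
    return {"properties": properties or {}, "children": children or {}}


# Hypothetical site: /content/products with a slider and a child page.
ROOT = node(children={
    "content": node(children={
        "products": node({"title": "Products"}, {
            # Page-level resources live under the page's content node.
            "jcr:content": node(children={
                "slider": node({"speed": "fast"}),
            }),
            # A child page is simply a child node of its parent page.
            "shoes": node({"title": "Shoes"}),
        }),
    }),
})


def resolve(root, path):
    """Walk the tree one path segment at a time; None if a segment is missing."""
    current = root
    for segment in path.strip("/").split("/"):
        if not segment:
            continue  # tolerate "/" and doubled slashes
        current = current["children"].get(segment)
        if current is None:
            return None
    return current
```

The same walk finds a page (`/content/products/shoes`) or a piece of a page (`/content/products/jcr:content/slider`), with no post IDs or lookup tables involved.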

Now that the URL uniquely identifies a resource, the next rule is that a GET request to the resource returns a representation of the resource (i.e. a GET to an image returns an image, which the browser renders; a GET to a page node returns HTML for the browser to render). You cannot update or delete the resource in a GET call.

But the page node only holds data, not HTML. So how does a GET request to a page return HTML? Actually, there is no hard and fast rule that a page should return HTML on a GET. It can return JSON, XML, or anything of your choice. What happens on a GET is this: the URL identifies the resource, the system processes the data in the resource (a script renders the node), and the result is returned. The GET is not returning the resource itself; it is returning a representation of the resource.

So who decides which script renders the node? In Sling, the ResourceResolver maps the URL to a resource, and Sling's script resolution then picks a script to render it. A resource can be rendered by multiple scripts, and a script is a resource too, so a resource tells Sling which resource will render it. This is done through the sling:resourceType property, which identifies what will render the resource. There can be multiple scripts for a resourceType; one of them is chosen based on Sling's script resolution rules.
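
Here is a rough sketch of that lookup in plain Python. It is a simplified model, not Sling's actual resolution algorithm: the `get` function, the `SCRIPTS` table and the `myapp/page` resource type are hypothetical, and picking the script by URL extension stands in for Sling's richer rules.

```python
import json

# Toy repository: the node holds data only, plus the property that
# names what should render it.
RESOURCES = {
    "/content/products": {"sling:resourceType": "myapp/page", "title": "Products"},
}


def render_html(resource):
    return "<h1>{}</h1>".format(resource["title"])


def render_json(resource):
    return json.dumps({"title": resource["title"]})


# Several scripts can exist for one resource type; here the extension
# of the URL decides which one is used (a simplification).
SCRIPTS = {
    ("myapp/page", "html"): render_html,
    ("myapp/page", "json"): render_json,
}


def get(url):
    """Resolve the resource, pick a script, return a representation."""
    path, _, extension = url.rpartition(".")
    resource = RESOURCES[path]
    script = SCRIPTS[(resource["sling:resourceType"], extension)]
    return script(resource)  # read-only: the resource is never modified
```

Calling `get("/content/products.html")` and `get("/content/products.json")` returns two different representations of the same node, which is the point: the data stays put and the script decides what it looks like.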

[This is the feature CQ5 leverages to create templates: you tell it which script renders a template. Any page created from the template gets the sling:resourceType value you defined for the template added to it on creation. This is why, when you change the resourceType of a template, pages already created from it before the change don't reflect it.]
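
The "copied on creation" behavior is easy to picture with a few lines of plain Python. This is only an illustration of the idea, not CQ5 code; `create_page` and the `page-v1` / `page-v2` resource types are invented names.

```python
# Hypothetical template: it only records which resource type new pages get.
TEMPLATE = {"sling:resourceType": "myapp/components/page-v1"}


def create_page(template, title):
    # The value is copied onto the page at creation time; the page keeps
    # no live link back to the template.
    return {"title": title, "sling:resourceType": template["sling:resourceType"]}


old_page = create_page(TEMPLATE, "About")

# Changing the template later only affects pages created afterwards.
TEMPLATE["sling:resourceType"] = "myapp/components/page-v2"
new_page = create_page(TEMPLATE, "Contact")
```

After this runs, `old_page` still points at `page-v1` while `new_page` points at `page-v2`.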

A POST to a resource updates the resource (or even creates one if there is no resource at the URL). The parameters in the POST request are stored as name-value pairs on the node. How does this happen? Any POST request is handled by Sling's default POST servlet, which cycles through the parameters and stores them. You need appropriate rights to POST to a resource; otherwise the request is denied.
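
In spirit, the default POST handling described above boils down to something like the following. This is a toy model in plain Python, not the Sling POST servlet: the `post` function, the `can_write` flag and the returned strings are assumptions made for the example.

```python
def post(repository, path, params, can_write=True):
    """Store every request parameter as a property on the node at `path`."""
    if not can_write:
        # Stand-in for the access check; the real servlet relies on
        # the repository's permissions.
        raise PermissionError("no write access to " + path)
    created = path not in repository
    node = repository.setdefault(path, {})  # create the node if it is missing
    node.update(params)                     # name-value pairs become properties
    return "created" if created else "updated"


repo = {}
first = post(repo, "/content/products/jcr:content/title", {"text": "Products"})
second = post(repo, "/content/products/jcr:content/title", {"text": "Our products"})
```

The first call creates the node, the second one overwrites its `text` property in place.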

[This is the backbone of content authoring in Sling: the dialogs you see in the authoring interface are forms. On hitting OK they do a POST to the node, instantly updating the values. (Keep your Firebug console open and edit a dialog to see it.)] This behavior of GET and POST is what makes Sling perfect for content management.

Now that you know how Sling is leveraged in CQ5, let's see how it's all put together. Every component you create is a resource. The data needed to display it is stored as name-value pairs on the node. So if you make a GET request all the way down to the component's node, you will see that it returns just the output of your component's JSP.

So why does a GET to a page return the output of all the components on it? The answer is cq:include and the parsys. You include components using cq:include or a parsys. When a request comes in for the page, cq:include includes the responses of the components within the response of the page (the node at the path you mention in the directive is retrieved, and that node [resource] is rendered by the resourceType you mention in the directive). The parsys is itself included with cq:include. When you drag and drop a component onto it, it actually creates a node for the component under the parsys node. When a request for the page comes along, the script of the parsys is invoked, which in turn includes the responses of the nodes under it. It's a huge servlet chaining process underneath, but it works like a charm for content management.
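
The include chain can be sketched as a small recursive renderer. Again, this is plain Python standing in for JSPs and cq:include, not CQ5 code: the `render` function, the script functions and the `myapp/...` resource types are hypothetical, and the page script includes a single child called `par` to play the role of the parsys.

```python
def node(resource_type, properties=None, children=None):
    return {"resourceType": resource_type,
            "properties": properties or {},
            "children": children or {}}


def render(n):
    """Render any node with the script registered for its resource type."""
    return SCRIPTS[n["resourceType"]](n)


def title_script(n):
    return "<h1>{}</h1>".format(n["properties"]["text"])


def text_script(n):
    return "<p>{}</p>".format(n["properties"]["text"])


def parsys_script(n):
    # The parsys just includes the response of every node under it, in order.
    return "".join(render(child) for child in n["children"].values())


def page_script(n):
    # Plays the role of an include of the parsys node in the page script.
    return "<html><body>{}</body></html>".format(render(n["children"]["par"]))


SCRIPTS = {
    "myapp/page": page_script,
    "myapp/parsys": parsys_script,
    "myapp/title": title_script,
    "myapp/text": text_script,
}

# Dropping a component on the parsys amounts to adding a child node here.
PAGE = node("myapp/page", children={
    "par": node("myapp/parsys", children={
        "title": node("myapp/title", {"text": "Welcome"}),
        "text": node("myapp/text", {"text": "Hello"}),
    }),
})
```

Rendering `PAGE` stitches together the output of every component, while rendering just the `text` node returns only that component's markup, the same difference you see between requesting a page and requesting a component node directly.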

Day Software (the creators of CQ5) designed the Sling framework specially for CQ5 and then made it open source by donating it to Apache. Sling entered Apache through the Incubator and is now a top-level Apache project.
