Thursday, March 17, 2016

HTTP Caching with ESI


In general, the idea behind caching is to improve web performance by delivering content more quickly and to reduce the load on the origin servers.

Caching static content, such as images, JavaScript and CSS files, and web content that rarely changes is relatively straightforward. Cache updates can be handled by regular timeouts or by conditional GET requests.
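As a rough illustration of a conditional GET, here is a minimal Python sketch. It assumes the validator is an ETag derived from a hash of the body. The function names (make_etag, origin_response) are made up for this example and do not belong to any particular server or library.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the content (assumption: a truncated SHA-256 hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_response(body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Answer a GET; reply 304 with an empty body when the cached copy is still valid."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, etag, b""   # cache may keep serving its stored copy
    return 200, etag, body      # first request, or the content has changed


# First request: the cache has nothing, so it receives the full body and an ETag.
status, etag, body = origin_response(b"body { color: red }", None)
# Revalidation: the cache sends the ETag back and receives a 304 if nothing changed.
status_again, _, body_again = origin_response(b"body { color: red }", etag)
```

The cache only pays for a full transfer when the content has actually changed.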

Caching personalised content is generally not possible, because the server’s response to each request for the same resource is different. Techniques such as server side includes (SSI) and edge side includes (ESI) can help by assembling the page from separate fragments.

Let's take a deeper look at ESI and see how it works.
By definition, ESI is a markup language for edge-level dynamic web content assembly. Its purpose is to tackle the problem of web infrastructure scaling.

ESI is implemented by content delivery networks such as Akamai, and by some caching proxy servers such as Varnish. Akamai also adds its own features to the version it supports.

So how is ESI implemented?
ESI element tags are inserted into HTML or other text-based content when it is created. These XML-based tags are not displayed to viewers; they are directives that tell the edge-side ESI processor what action to take to complete the page's assembly.

One simple example of an ESI element is the include tag which is used to include content external to the page. An ESI include tag placed in-line within an HTML document would look like this:


<esi:include src="http://yoursite.com/page1.html" alt="http://beta.yoursite.com/page2.html" onerror="continue">  </esi:include>


In this case the ESI processor would retrieve the src URL, or failing that the alt URL. The ESI system is usually a caching proxy server so it may have a local copy of these files which it can insert without going back to the server. Alternatively, the whole page with the ESI tags may be cached, and only the ESI requests may be made to the origin server. This allows different caching times for different parts of the page, or different degrees of personalisation.
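To make the src / alt / onerror behaviour concrete, here is a toy include processor in Python. It is only a sketch of the idea, not how Akamai or Varnish implement ESI: the regular expressions handle just the simple include form shown above, and the fetch callable (returning None when a URL cannot be retrieved) is an assumption of this example.

```python
import re
from typing import Callable, Optional

# Matches <esi:include .../> as well as <esi:include ...>  </esi:include>
INCLUDE_RE = re.compile(r'<esi:include\s+([^>]*?)\s*(?:/>|>\s*</esi:include>)', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def process_includes(page: str, fetch: Callable[[str], Optional[str]]) -> str:
    """Replace each include tag with the fragment from src, or failing that from alt."""

    def replace(match):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        for key in ("src", "alt"):              # try src first, then the alt URL
            url = attrs.get(key)
            if url is not None:
                fragment = fetch(url)
                if fragment is not None:
                    return fragment
        if attrs.get("onerror") == "continue":  # fail silently: drop the tag
            return ""
        raise RuntimeError("could not include " + attrs.get("src", "<no src>"))

    return INCLUDE_RE.sub(replace, page)


# A dict stands in for the proxy's local cache; here only the alt URL is available.
cache = {"http://beta.yoursite.com/page2.html": "<p>beta fragment</p>"}
page = ('<div><esi:include src="http://yoursite.com/page1.html" '
        'alt="http://beta.yoursite.com/page2.html" onerror="continue">  </esi:include></div>')
assembled = process_includes(page, cache.get)
```

A real processor would also apply a separate cache lifetime to each fragment, which is what allows different parts of the page to expire at different times.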



ESI has four main features:
  • Inclusion of page fragments
  • Variables which can be set from cookies or HTTP headers and then used in other ESI statements or written into markup
  • Conditions so that different markup can be used based on variables, for example if a cookie is set or not
  • Error handling for failover if an origin server is unavailable.


Which CDNs support ESI?
ESI is currently supported by CDNs such as Akamai, Fastly and CloudFlare, and by caching proxy servers such as Varnish and Squid. Many of them do not implement the complete specification, while Akamai adds its own features to the version it supports.



Wednesday, September 24, 2014

Extending and Developing Cartridges in Endeca 11.x

In this post I explore cartridges and the Endeca Assembler application by examining how they work together in a "Hello World" example cartridge.
Before making our own custom cartridge, let’s first understand what a cartridge, a cartridge template and a cartridge handler are, and how a cartridge is structured. We will also take a close look at the Endeca Assembler application to understand what it does under the hood.

Cartridges and Cartridge Templates –
A cartridge is a content item with a specific role in your application; for example, a cartridge can map to a GUI component in the front-end application. The Assembler includes a number of cartridges that map to typical GUI components, for example a Breadcrumbs cartridge, a Search Box cartridge, and a Results List cartridge.
You can create other cartridges that map to other GUI components expected by your business users.

Every cartridge is defined by a template. A cartridge template defines:
  ·    The structure and initial configuration for a content item.
  ·    A set of configurable properties and the associated editors with which the business user can configure them.

Experience Manager instantiates each content item from its cartridge template. This includes any configuration made by the business user, and results in a content item with instance configuration that is passed to the Assembler.

Cartridge Handlers -
A cartridge handler takes a content item as input, processes it, and returns a content item as output.
The input content item typically includes instance configuration, which consists of any properties specified by a business user using the Experience Manager or Rule Manager tool in Endeca Workbench. The content item is typically initialized by layering configuration from other sources: your application may include default values, or URL parameters that represent end user selections in the front-end application.
A cartridge handler can optionally perform further processing, such as querying a search engine for data. When processing is finished, the handler returns a completed content item to the application.

Note: Not all cartridges require cartridge handlers. In the case of a content item with no associated cartridge handler, the Assembler returns the unmodified content item.
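The layering of configuration described above can be pictured as merging dictionaries. The sketch below is plain Python, not the Assembler's Java API. The function name layer_config, the sample property recordsPerPage and the precedence order (defaults, then instance configuration, then URL parameters) are all assumptions made for illustration.

```python
def layer_config(defaults, instance_config, url_params):
    """Merge configuration layers; later layers override earlier ones (assumed order)."""
    merged = dict(defaults)          # application-wide default values
    merged.update(instance_config)   # values set by the business user in Workbench
    merged.update(url_params)        # end user selections from the request
    return merged


# Hypothetical example: a default page size overridden by the business user,
# plus a value that only the end user's request supplies.
config = layer_config(
    {"recordsPerPage": 10, "sort": "relevance"},
    {"recordsPerPage": 20},
    {"sort": "price"},
)
```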

Cartridge structure -
The template contains two main sections: the <ContentItem> element and the <EditorPanel> element.
The content item is a core concept in Assembler applications that can represent both the configuration model for a cartridge and the response model that the Assembler returns to the client application. A content item is a map of properties, or key-value pairs. The <ContentItem> element in the template defines the prototypical content item and its properties, similar to a class or type definition.



In the example template explained below, we define two string properties named message and messageColor and attach a simple string editor to each of them. The result looks like this in Experience Manager:



A brief note on Endeca Assembler Application -
The Endeca Assembler application enables a web application to query the MDEX Engine and retrieve the appropriate dynamic content based on the user's navigation state.
The Assembler application provides a RESTful web service API that returns results in either JSON or XML format.


What happens at runtime?
The business user creates and configures instances of cartridges in Experience Manager based on a template. During cartridge development you need to create at least one instance of a cartridge for testing.
The Assembler retrieves this configuration at runtime and uses it to build the response model that it returns to the client application.


For any given cartridge, the default behavior is for the Assembler to do no processing on the configuration and simply return the configuration content item as a map of properties. That is, the response object is the same as the configuration object unless specific processing logic is defined in the Assembler for that cartridge.
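The default pass-through behaviour can be sketched in a few lines of Python. This is only a model of the idea: the real Assembler is a Java framework, and the registry dictionary, the assemble function and the upper-casing handler below are hypothetical names invented for this example.

```python
from typing import Callable, Dict

ContentItem = Dict[str, object]
Handler = Callable[[ContentItem], ContentItem]


def assemble(item: ContentItem, handlers: Dict[str, Handler]) -> ContentItem:
    """Return the item unchanged unless a handler is registered for its type."""
    handler = handlers.get(str(item.get("@type")))
    if handler is None:
        return item                 # no handler: response == configuration
    return handler(dict(item))      # handler works on a copy of the configuration


def shouting_handler(item: ContentItem) -> ContentItem:
    """Hypothetical handler that post-processes the configured message."""
    item["message"] = str(item.get("message", "")).upper()
    return item


hello_config = {"@type": "Hello", "message": "Hello", "messageColor": "#FF0000"}
unprocessed = assemble(hello_config, {})
processed = assemble(hello_config, {"Hello": shouting_handler})
```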


Cartridge creation workflow
The high-level workflow for creating a basic cartridge is:
    1.     Create a cartridge template and upload it to Endeca Workbench.
    2.     Use Experience Manager to create and configure an instance of the cartridge.
    3.     Add a renderer to the front-end application.

Step 2 is necessary during development in order to have a cartridge instance with which to test. However, once the cartridge is complete, the business user is typically responsible for creating and maintaining cartridge instances in Experience Manager.
In the sections below, we'll look at each of these steps in detail.


Hello World cartridge example
Here we will define a new cartridge and use Workbench to configure it to appear on a page.
Follow these steps to create and configure a basic "Hello World" cartridge.
1.   Navigate to the templates directory of your application (Discover in our case), and create a subdirectory named "HelloWorld". This directory name is the template ID for your template.
For example: D:\Endeca\apps\Discover\config\import\templates\HelloWorld

2.   Create a cartridge template: copy the following into a new file and save it as template.xml in the HelloWorld directory created in step 1.


<ContentTemplate xmlns="http://endeca.com/schema/content-template/2008"
                 xmlns:editors="editors" type="SecondaryContent">
    <Description>A sample cartridge that can display a simple message.</Description>
    <ThumbnailUrl>/ifcr/tools/xmgr/img/template_thumbnails/sidebar_content.jpg</ThumbnailUrl>
    <ContentItem>
        <Name>Hello cartridge</Name>
        <Property name="message">
            <String/>
        </Property>
        <Property name="messageColor">
            <String/>
        </Property>
    </ContentItem>
    <EditorPanel>
        <BasicContentItemEditor>
            <editors:StringEditor propertyName="message" label="Message"/>
            <editors:StringEditor propertyName="messageColor" label="Color"/>
        </BasicContentItemEditor>
    </EditorPanel>
</ContentTemplate>

3.   Upload the template to Endeca Workbench
    ·   Open a command prompt and navigate to the control directory of your deployed application, for example, D:\Endeca\apps\Discover\control
    ·   Run the set_templates command

4.   Add the cartridge to a page
    ·   Open Endeca Workbench in a Web browser.
        The default URL for Workbench is http://localhost:8006. The default Username is admin and the default Password is admin
    ·   From the launch page, select Experience Manager
    ·   In the tree on the left, select Search and Navigation Pages under the Content section, then select the Default Page
    ·   In the Edit Pane on the right, select the right column section from the Content Tree in the bottom left
    ·   Click Add.
        The cartridge selector dialog displays

    ·   Select the HelloWorld cartridge and click OK
    ·   Select the new Hello cartridge from the Content Tree on the left and configure it as shown

    ·   Click Save Changes in the upper right of the page.

5.   Try to view the cartridge in the Discover Electronics application.
     In a Web browser, navigate to http://localhost:8006/discover-authoring/


An error displays because we have not yet created a renderer for the Hello cartridge.
Scroll down to the bottom of the page and click the json link to view the serialized Assembler response model that represents the current page.
Oracle recommends that you use a browser, or install a plugin, that supports native JSON display. Otherwise, you can download the JSON response as a file.
Alternatively, you can click the xml link to view the same response in XML. In this article, we use the JSON format when examining the Assembler response.

The following shows the JSON representation of the page with most of the tree collapsed, highlighting the data for the cartridge that we just added.

{
    "@type": "ResultsPageSlot",
    "name": "Browse Page",
    "contentCollection": "Search And Navigation Pages",
    "ruleLimit": "1",
    "contents": [
        {
            "@type": "ThreeColumnNavigationPage",
            "name": "Default Page",
            "title": "Discover Electronics",
            "metaKeywords": "camera cameras electronics",
            "metaDescription": "Endeca eBusiness reference application.",
            "links": [ ],
            "header": [ ... ],
            "leftColumn": [ ... ],
            "main": [ ... ],
            "rightColumn": [
                { ... },
                { ... },
                {
                    "@type": "Hello",
                    "name": "Hello cartridge",
                    "message": "Hello",
                    "messageColor": "#FF0000"
                }
            ]
        }
    ],
    ...
}
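
If you would rather inspect the response programmatically than read it in the browser, a small recursive search is enough. The Python sketch below works on a trimmed copy of the response above with the collapsed parts left out. The helper name find_by_type is my own and not part of any Endeca tooling.

```python
import json


def find_by_type(node, wanted):
    """Collect every content item in the tree whose "@type" equals wanted."""
    found = []
    if isinstance(node, dict):
        if node.get("@type") == wanted:
            found.append(node)
        for value in node.values():
            found.extend(find_by_type(value, wanted))
    elif isinstance(node, list):
        for child in node:
            found.extend(find_by_type(child, wanted))
    return found


# Trimmed sample: only the parts of the page that matter for this example.
response = json.loads("""
{
  "@type": "ResultsPageSlot",
  "name": "Browse Page",
  "contents": [
    {
      "@type": "ThreeColumnNavigationPage",
      "name": "Default Page",
      "rightColumn": [
        {"@type": "Hello", "name": "Hello cartridge",
         "message": "Hello", "messageColor": "#FF0000"}
      ]
    }
  ]
}
""")

hello_items = find_by_type(response, "Hello")
```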


In the next section, we'll create a simple renderer that displays the message based on the values configured in Experience Manager.

Adding a basic renderer -
While there is no one way to write rendering code for an application, in this example we'll write a simple JSP renderer for our basic cartridge.
To write a basic "Hello, World" renderer:
      1.  Create a new JSP page (Hello.jsp) and type or copy the following:

<%@page language="java" pageEncoding="UTF-8" contentType="text/html;charset=UTF-8"%>
<%@include file="/WEB-INF/views/include.jsp"%>
<div style="border-style: dotted; border-width: 1px; border-color: #999999; padding: 10px 10px">
    <div style="font-size: 150%; color: ${component.messageColor}">${component.message}</div>
</div>

      2.  Save the above renderer to
          D:\Endeca\ToolsAndFrameworks\11.1.0\reference\discover-electronics-authoring\WEB-INF\views\desktop\Hello\Hello.jsp
          (You need to create the “Hello” folder.)

      3.  Refresh the Discover Electronics authoring application at
          http://localhost:8006/discover-authoring/ to see the end result :-)





In future posts I would love to explore optimizing application URLs and integrating with the Sitemap Generator. All of the above concepts, and much more, are covered in the Endeca Assembler Application Developer's Guide.

Saturday, September 20, 2014

Endeca commerce guided search essentials

Time to explore the bare-bones concepts around Endeca Commerce Guided Search v11.1. It’s vital to understand the key concepts before moving on to some of the more complex areas of Endeca search. So let's dive in.

Endeca commerce key components -
Oracle Endeca Commerce comprises three major components:
  • Endeca ITL (Information Transformation Layer)
  • Endeca MDEX Engine
  • Endeca Application Tier



Endeca Information Transformation Layer (ITL)
  • Reads your raw source data and manipulates it into a set of Oracle Endeca MDEX Engine indexes. 
  • The ITL consists of the Content Acquisition System (which includes the Endeca CAS Server and Console, the CAS API and the Endeca Web Crawler), and the Data Foundry (which includes data-manipulation programs such as Forge).

Endeca MDEX Engine
  • The MDEX Engine is the query engine at the core of guided search. It consists of the Indexer (Dgidx), the Dgraph and the Agraph.
  • The MDEX Engine loads the indexes generated by the indexing component of the Endeca Information Transformation Layer.
  • Although the Indexer (also known as Dgidx) is installed as part of the MDEX Engine package, in effect it is part of the ITL process.

Endeca Application Tier
  • After the indexes are loaded, the MDEX Engine receives queries from the Endeca Application Tier, executes them against the loaded indexes, and returns the results to the client application. 
  • The Application Tier provides an interface to the MDEX Engine via the Endeca Assembler. The Assembler acts as a language-agnostic interface for aggregating and sending queries to the MDEX Engine, and executing any necessary post-processing on the results.

Note: The ITL components are run offline at intervals that are appropriate for your business requirements.
The MDEX Engine and Endeca Application Tier are both online processes; that is, they must remain running as long as you want clients to have access to your data set.


Now that we know the main components, let's go through some of the key terminology around guided search that we will encounter throughout our Endeca experience. Endeca records, dimensions, and properties store and organize product information, making it accessible to customers through your Guided Search applications.


Endeca Records -
  • Endeca records are the elements of your data set that users navigate to or search for.
  • Records are the fundamental units of data. Endeca records are based on traditional records in a source database. Source database records typically contain information such as the bottles of wine in a wine store, the customer records in a CRM application, or the mutual funds in a fund evaluator.
  • Source database records store this information in one or more key/value pairs, known as properties.
  • A single Endeca record could correspond to any number of source records. For example, suppose that four different source records refer to the same book in different formats: hardcover, paperback, large print, and audio. All four could then correspond to a single Endeca record for that book.
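
To picture the "many source records, one Endeca record" idea, here is a small Python sketch that combines source records sharing a key into one multi-valued record. It is purely illustrative and is not how the ITL performs this step. The function name combine_records and the isbn, title and format fields are assumptions made for the example.

```python
def combine_records(source_records, key):
    """Combine source records that share the same key value into one record.

    Each property of the combined record holds the list of distinct values seen.
    """
    combined = {}
    for record in source_records:
        target = combined.setdefault(record[key], {})
        for name, value in record.items():
            values = target.setdefault(name, [])
            if value not in values:
                values.append(value)
    return list(combined.values())


# Four hypothetical source records for the same book in different formats.
sources = [
    {"isbn": "123", "title": "Example Book", "format": "hardcover"},
    {"isbn": "123", "title": "Example Book", "format": "paperback"},
    {"isbn": "123", "title": "Example Book", "format": "large print"},
    {"isbn": "123", "title": "Example Book", "format": "audio"},
]
books = combine_records(sources, "isbn")
```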

Endeca Properties -
  • Endeca properties are the basic attributes of an Endeca record.
  • They are usually generated from a record’s source properties, using source property mapping.
  • They consist of key/value pairs (property name/property value).
  • They can be searched and displayed.
Endeca properties often contain more specific information about a record than dimensions. For example, a Price Range dimension is useful for navigation ("give me all the bottles of wine that cost between $10 and $20"), but it is the exact price of each bottle that you want to see when looking at the individual records. A common implementation for this type of application uses a Price Range dimension for navigation and a Price property that is displayed once a bottle's record has been located.
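
The split between a Price Range dimension and a Price property can be sketched in Python as follows. The bucket width of 10, the label format and the sample wines are assumptions of this example. In a real application the mapping is defined in your Endeca pipeline, not in code like this.

```python
def price_range(price: float, width: int = 10) -> str:
    """Map an exact price to a range label such as "$10-$20" (assumed bucket width)."""
    low = int(price // width) * width
    return f"${low}-${low + width}"


def navigate(records, wanted_range):
    """Navigation uses the range; the exact Price stays on each returned record."""
    return [r for r in records if price_range(r["Price"]) == wanted_range]


# Hypothetical records with an exact Price property.
wines = [
    {"Name": "House Red", "Price": 14.99},
    {"Name": "Reserve White", "Price": 32.50},
]
matches = navigate(wines, "$10-$20")
```

The shopper refines by the coarse range, and the page then displays the exact Price of each matching bottle.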

Note: Property, dimension, and dimension value names are case sensitive.


Endeca Queries -
  • Oracle Endeca Commerce uses two types of queries: navigation queries and keyword search queries.
  • Navigation queries return a set of records based on application-defined record characteristics (such as wine type or region in an online wine store), plus any follow-on query information.
  • Keyword search queries return a set of records or dimensions based on a user-defined keyword, plus any follow-on query information.
Navigation queries and keyword search queries are complementary. In fact, a keyword search query is a specialized form of navigation query, and the data structures for the results of the two queries are identical: a set of records and follow-on query information.
Users can execute a combination of navigation queries and keyword search queries to navigate to their desired record set in the way that works best for them.


Dimensions and dimension values -
  • Dimensions are logical categories that make it possible to organize your Endeca records into a hierarchical structure that customers can search.
  • A dimension is a collection of related dimension values, organized into a tree. The top-most dimension value in a dimension tree is known as the dimension root. A dimension root always has the same name as its dimension.
  • Each dimension value can have one or more child dimension values; a dimension value with child dimension values is known as a parent dimension value. A child dimension value can have only one parent dimension value. Dimension values that are children of the same parent dimension value are known as sibling dimension values. The dimension values that have no children are known as leaf dimension values.




Dimension hierarchy -
  • Dimension hierarchy gives you additional control over the logical structure used to organize your Endeca records.
  • As the term "dimension tree" implies, dimension values can have parent and child dimension values.
  • A dimension value that has sub-dimension values is the parent of those sub-dimension values. The sub-dimension values themselves are children or child dimension values. 
  • Child dimension values of the same parent dimension at the same level of hierarchy are dimension value siblings.
The following figure illustrates a typical dimension :



Flat Dimensions -
Dimensions that have only one level of hierarchy beneath the dimension root are called flat dimensions.




The one parent rule -
A dimension value can have only one parent dimension value. It can, however, be simultaneously a child of one dimension value and the parent of other dimension values.



Ancestors -
The ancestors of a given dimension value are all the dimension values between the given dimension value and the dimension root. The parent of the given dimension value is also one of its ancestors.
In the example below, Other and Fortified represent the ancestors for the Sherry dimension value.
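
The terms above (root, parent, child, sibling, leaf, ancestor, flat) map naturally onto a small tree structure. Here is a minimal Python sketch. The class and function names are my own, and the exact shape of the sample tree (Red and Other under Wine Type, then Fortified and Sherry) is an assumption based on the Sherry example.

```python
class DimensionValue:
    """A node in a dimension tree; each value has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_leaf(self):
        return not self.children

    def siblings(self):
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

    def ancestors(self):
        """Values between this one and the dimension root, nearest first."""
        result = []
        node = self.parent
        while node is not None and node.parent is not None:
            result.append(node)
            node = node.parent
        return result


def is_flat(root):
    """A dimension is flat when every child of the root is a leaf."""
    return all(child.is_leaf() for child in root.children)


wine_type = DimensionValue("Wine Type")          # dimension root
red = DimensionValue("Red", wine_type)
other = DimensionValue("Other", wine_type)
fortified = DimensionValue("Fortified", other)
sherry = DimensionValue("Sherry", fortified)

sherry_ancestors = [value.name for value in sherry.ancestors()]
```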



Advantages of dimension hierarchy -
You can design dimension hierarchies in ways that reduce the number of follow-on queries that are presented to users as they navigate.

For example, in the flat dimension below, a navigation query on the Wine Type dimension value would return six possible refinement queries, one for each child.


A simple flat dimension such as the one shown in the preceding figure is small enough to navigate through quickly. Larger flat dimensions, however, present too many choices for follow-on queries to be easily navigable.
Organizing dimension values into a hierarchy reduces this information overload and provides for an easier, more intuitive navigation experience. In the hierarchical example below, the Wine Type dimension value has only three possible refinement queries: Red, White, or Sparkling.


A second reason to use dimension hierarchy is that, by limiting the number of refinement queries, you decrease the amount of time it takes for the MDEX Engine to return its results. Returning query results that contain follow-on information for 20 refinement queries is faster than returning query results that contain information for 2000 refinement queries.

Note: The time that the MDEX Engine takes to process a large flat dimension can be lessened by requesting that the MDEX Engine return only the most important refinements.

Having gone through the above concepts, the next topics to look at are Guided Navigation and Using Keyword Search.

Wednesday, September 10, 2014

Oracle ATG commerce integration with Endeca search

Exploring new software and tools is always an exciting and fun way to learn. With ATG Commerce v11, Oracle has replaced ATG Search with Endeca search. I have tried to list all the steps involved in this integration and to highlight the issues and errors I faced, along with their resolutions. So let's dive in.

   Installation requirements with download links –
a.      Download and install Java 7 or higher (I am using JDK 7 update 45) and set the environment variable JAVA_HOME = D:\Java\jdk1.7.0_45
b.     Install the application server of your choice. I have used JBoss Enterprise Application Platform version 6.1.0.GA, which you can download from the JBoss site.

c.      Install Oracle XE 11g: download it from the Oracle website and accept all the default settings.
After you install Oracle XE, create the following schemas, which will be used when configuring ATG via CIM.

create user crs_switching_a identified by crs_switching_a;
grant connect, resource, dba to crs_switching_a;

create user crs_switching_b identified by crs_switching_b;
grant connect, resource, dba to crs_switching_b;

create user crs_publishing identified by crs_publishing;
grant connect, resource, dba to crs_publishing;

create user crs_production identified by crs_production;
grant connect, resource, dba to crs_production;

commit;

You also need the JDBC driver jar, which will be needed later for the database configuration in CIM. You can download the jar from the Oracle site. I am using ojdbc6.jar; you can use higher versions as well.

d.     Go to https://edelivery.oracle.com and download ATG frameworks and Endeca frameworks (version 11.1)
Downloads for ATG –
Oracle Commerce Platform 11.1
Oracle Commerce ReferenceStore 11.1

Downloads for Endeca search –
Oracle Endeca MDEX Engine 6.5.1
Oracle Endeca Platform Services 11.1.0
Oracle Endeca Tools and Frameworks 11.1.0
Oracle Endeca Content Acquisition System 11.1.0

     Installing Endeca Commerce with Experience Manager  -
Go to http://docs.oracle.com/cd/E51272_02/Common.110/pdf/GettingStarted.pdf and follow the instructions given in chapter 3 and chapter 5 of this guide for Endeca installation and configuration.
Please note that we are installing Endeca first, as it does not depend on the ATG installation. We want to make sure Endeca is working fine before moving on to ATG installation and configuration.
You can skip the Developer Studio installation, as it is not needed for this integration.

After a successful installation you will see the following services installed. Make sure all 3 Endeca services are started, as shown below.



You will also see the following folders in your D:\Endeca folder after all the Endeca-related installations.



Time to validate our Endeca installation. If you have followed all the instructions, your Discover Electronics reference application is ready to validate.
Visit http://localhost:8006/discover/ and you should see the page below.


Note: To see the Workbench screen, go to http://localhost:8006/

Installing ATG Commerce and the reference store -
Now we need to install the ATG platform first and then the reference store. Visit http://docs.oracle.com/cd/E41069_01/CRS.11-0/ATGCRSInstall/html/s0101introduction01.html for a step-by-step installation guide.

During installation, setup will ask for the JBoss and Java paths. Specify the local folder paths where you have installed them. Please note that the port I have used here is 8180.


I have avoided using 8080 because Oracle XE is already listening on this port.
I am using the following ports throughout all installations and configurations here.
8180 for Store
8280 for Publishing
8380 for SSO
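
Before settling on ports, it can help to check which ones are already taken on your machine (as 8080 was by Oracle XE in my case). Below is a minimal Python sketch using only the standard library. The helper name port_in_use is my own, and the list of ports simply repeats the ones used in this post.

```python
import socket


def port_in_use(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports mentioned in this post: Oracle XE's 8080 plus the three ATG servers.
    for port in (8080, 8180, 8280, 8380):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

Run it before starting the ATG servers. Any port reported as in use at that point belongs to some other process.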

Don't forget to set the environment variable DYNAMO_HOME = D:\ATG\ATG11.1\home

 Configuring ATG environment using CIM  -
Once the ATG platform and CRS installation from the previous step is complete, we need to launch CIM. Open a command prompt, navigate to D:\ATG\ATG11.1\home\bin and run cim.bat
These are the main items you are going to configure. I suggest you follow the CIM responses given in either of the two links provided in the References section below, which list all the CIM responses in detail.
  • Set the Administrator Password
  • Product Selection – select 9, as it selects all the essential items to install. Then, in AddOns, choose Single Sign On (SSO)
  • Database Configuration
  • Configure OPSS Security
  • Application Assembly and Deployment

After all the above configuration, the following steps remain –

a.     Start all the server instances (store, publishing and SSO) using the commands below.

               D:\jboss-eap-6.1\bin>standalone.bat --server-config=ATGProduction.xml -b 0.0.0.0
               D:\jboss-eap-6.1\bin>standalone.bat --server-config=ATGPublishing.xml -b 0.0.0.0
               D:\jboss-eap-6.1\bin>standalone.bat --server-config=ATGSSO.xml -b 0.0.0.0

b.      You need to perform a full deployment in BCC. Follow the instructions for this task here - http://docs.oracle.com/cd/E41069_01/CRS.11-0/ATGCRSInstall/html/s0213configuringandrunningafulldeploy01.html
              If you face issues accessing BCC, check the troubleshooting section below for help.

c.       Ideally, after a successful full deployment the system should build the index. To verify, go to http://localhost:8280/dyn/admin/nucleus/atg/commerce/endeca/index/ProductCatalogSimpleIndexingAdmin
and check the indexing job status. If it is in the PENDING or CANCELLED state, you need to manually invoke “Baseline Index”.


d.      Once indexing is complete, we finally need to promote the Commerce Reference Store content. Please follow the instructions here to invoke promote_content - http://docs.oracle.com/cd/E41069_01/CRS.11-0/ATGCRSInstall/html/s0215promotingthecommercereferencesto01.html

e.     It's now time to finally hit the store URL and explore the CRS store - http://localhost:8180/crs/storeus



 Common errors and troubleshooting –

  • If you get the following error while accessing BCC, read the solution below –
      Error /atg/portal/framework/PortalObjectResolver      No root community folder for portal 'default' 
      Solution – It's due to missing BIZUI data, follow below url and see the last section about 
  • You may encounter an ORA-01658 error -
       Error - ORA-01658: unable to create INITIAL extent for segment in tablespace LARGE_ 
       Solution - It's because Oracle failed to find sufficient contiguous space to allocate the
       INITIAL extent for the segment being created. Visit this link for solution -    
  • If you encounter any connection-related error -
       Error - Caused by: java.sql.SQLException: Listener refused the connection with the following  
        error: ORA-12519, TNS:no appropriate service handler found 
       Solution - I increased the timeout setting in the publishing server configuration. Also make sure your
        DB is up and running.
  • You may not be able to log in to BCC until you log in to SSO (http://localhost:8380/sso/login). Make sure you run the SSO instance just as you start your store or publishing servers.

           References  -
a.      http://iam-saminda.blogspot.co.uk/2013/06/oracle-ecommerce-atg-102.html  - One of the best-written blogs, detailing ATG 10 integration with Endeca search.
b.     http://internetmarketing-readme.pricemaniacs.com/oracle-commerce-v11-step-by-step-cim-responses  - A blog that lists all the CIM responses for ATG environment configuration using CIM.

Bonus -

> How to check the status of the Endeca Log Server -
Visit - http://localhost:15010/stats

If the Log Server is running, the above URL returns a confirmation message containing the file name, number of log entries and number of errors.
If it is not running, you will see your browser’s default error message.

OUTPUT -
Endeca log server is running.
log file: D:\Endeca\apps\Discover\logs\logserver_output\LogServer.2014_09_10.17_45_40
number of log entries: 9
number of errors: 0
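
If you want to script this check instead of opening a browser, something like the Python sketch below could work. It assumes the response is plain text in exactly the format shown in the OUTPUT above. The function names fetch_stats and parse_stats are my own, and fetch_stats will raise an error if the Log Server is not reachable.

```python
import urllib.request


def fetch_stats(url: str = "http://localhost:15010/stats", timeout: float = 5.0) -> str:
    """Download the stats page; raises urllib.error.URLError if the server is down."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_stats(text: str) -> dict:
    """Parse the stats text (assumed format: one 'key: value' pair per line)."""
    stats = {"running": "log server is running" in text.lower()}
    for line in text.splitlines():
        key, sep, value = line.partition(":")   # split on the first colon only
        key = key.strip().lower()
        if sep and key in ("log file", "number of log entries", "number of errors"):
            value = value.strip()
            stats[key] = int(value) if value.isdigit() else value
    return stats


# Sample text copied from the OUTPUT shown above.
sample = r"""Endeca log server is running.
log file: D:\Endeca\apps\Discover\logs\logserver_output\LogServer.2014_09_10.17_45_40
number of log entries: 9
number of errors: 0"""
parsed = parse_stats(sample)
```

Calling parse_stats(fetch_stats()) against a running Log Server should then give you the same fields to check in a monitoring script.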

> Anatomy of an Oracle Commerce Experience Manager Site

