Time to explore the bare-bones concepts behind Endeca Commerce Guided Search v11.1. It’s vital to understand the key concepts before moving on to some of the more complex areas of Endeca search. So let’s dive in.
Endeca Commerce key components -
Oracle Endeca Commerce comprises three major components:
- Endeca ITL (Information Transformation Layer)
- Endeca MDEX Engine
- Endeca Application Tier
Endeca Information Transformation Layer (ITL)
- Reads your raw source data and manipulates it into a set of Oracle Endeca MDEX Engine indexes.
- The ITL consists of the Content Acquisition System (which includes the Endeca CAS Server and Console, the CAS API and the Endeca Web Crawler), and the Data Foundry (which includes data-manipulation programs such as Forge).
Endeca MDEX Engine
- The MDEX Engine is the query engine at the core of Guided Search. The MDEX Engine package consists of the Indexer (Dgidx), the Dgraph, and the Agraph.
- The MDEX Engine loads the indexes generated by the indexing component of the Endeca Information Transformation Layer.
- Although the Indexer (also known as Dgidx) is installed as part of the MDEX Engine package, in effect it is part of the ITL process.
Endeca Application Tier
- After the indexes are loaded, the MDEX Engine receives queries from the Endeca Application Tier, executes them against the loaded indexes, and returns the results to the client application.
- The Application Tier provides an interface to the MDEX Engine via the Endeca Assembler. The Assembler acts as a language-agnostic interface for aggregating and sending queries to the MDEX Engine, and executing any necessary post-processing on the results.
Note: The ITL components are run offline at intervals that are appropriate for your business requirements.
The MDEX Engine and Endeca Application Tier are both online processes; that is, they must remain running as long as you want clients to have access to your data set.
Now that we know the main components, let’s go through some of the key terminology around guided search that we will encounter throughout our Endeca experience. Endeca records, dimensions, and properties store and organize product information, making it accessible to customers through your Guided Search applications.
Endeca Records -
- Endeca records are the elements of your data set that users navigate to or search for.
- Records are the fundamental units of data. Endeca records are based on traditional records in a source database. Source database records typically contain information such as the bottles of wine in a wine store, the customer records in a CRM application, or the mutual funds in a fund evaluator.
- Source database records store this information in one or more key/value pairs, known as properties.
- A single Endeca record could correspond to any number of source records. For example, suppose that four different source records refer to the same book in different formats: hardcover, paperback, large print, and audio. In that case, a single Endeca record for the book could correspond to all four source records.
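To make the many-to-one relationship concrete, here is a minimal sketch in plain Python. It does not use any Endeca API. The `book_id` key, the sample titles, and the `merge_by_key` helper are all assumptions made up for illustration.

```python
# Hypothetical source records: four formats of the same book.
source_records = [
    {"book_id": "B-100", "title": "Wine Atlas", "format": "hardcover"},
    {"book_id": "B-100", "title": "Wine Atlas", "format": "paperback"},
    {"book_id": "B-100", "title": "Wine Atlas", "format": "large print"},
    {"book_id": "B-100", "title": "Wine Atlas", "format": "audio"},
]


def merge_by_key(records, key):
    """Combine source records that share `key` into one record each.

    Every property becomes a list of the distinct values seen for it.
    """
    merged = {}
    for rec in records:
        target = merged.setdefault(rec[key], {})
        for name, value in rec.items():
            values = target.setdefault(name, [])
            if value not in values:
                values.append(value)
    return list(merged.values())


# One combined record that corresponds to all four source records.
combined_records = merge_by_key(source_records, "book_id")
```

Whether you combine source records like this or keep them separate depends on how you want users to find the book.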
Endeca Properties -
- Endeca properties are the basic attributes of an Endeca record.
- Are usually generated from a record’s source properties, using source property mapping.
- Consist of key/value pairs (property name/property value).
- Can be searched and displayed.
Endeca properties often contain more specific information about a record than dimensions. For example, a Price Range dimension is useful for navigation (give me all the bottles of wine that cost between $10 and $20), but it is the exact price of each bottle that you want to see when looking at the individual records. A common implementation for this type of application uses a Price Range dimension for navigation and a Price property that is displayed when a bottle’s record has been located.
Note: Property, dimension, and dimension value names are case sensitive.
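The Price Range versus Price split can be sketched in a few lines of plain Python. This is only an illustration, not Endeca configuration: the bucket boundaries, the bottle names, and the `price_range` helper are assumptions.

```python
def price_range(price):
    """Map an exact price to a hypothetical Price Range dimension value."""
    if price < 10:
        return "Under $10"
    if price < 20:
        return "$10 to $20"
    return "$20 and over"


# Each bottle keeps its exact Price property for display.
bottles = [
    {"Name": "Bottle A", "Price": 8.50},
    {"Name": "Bottle B", "Price": 14.99},
    {"Name": "Bottle C", "Price": 32.00},
]

# Tag each bottle with the coarser Price Range value used for navigation.
for bottle in bottles:
    bottle["Price Range"] = price_range(bottle["Price"])

# Navigate by range, then read the exact price from the record.
in_range = [b for b in bottles if b["Price Range"] == "$10 to $20"]
```

Like Endeca names, the dictionary keys in this sketch are case sensitive, so `"Price"` and `"price"` would be two different keys.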
Endeca Queries -
- Oracle Endeca Commerce uses two types of queries: navigation queries and keyword search queries.
- Navigation queries return a set of records based on application-defined record characteristics (such as wine type or region in an online wine store), plus any follow-on query information.
- Keyword search queries return a set of records or dimensions based on a user-defined keyword, plus any follow-on query information.
Navigation queries and keyword search queries are complementary. In fact, a keyword search query is a specialized form of navigation query, and the data structures for the results of the two queries are identical: a set of records and follow-on query information.
Users can execute a combination of navigation queries and keyword search queries to navigate to their desired record set in the way that works best for them.
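The idea that both query types return the same shape of result (records plus follow-on query information) can be sketched as follows. This is a toy model in plain Python, not the MDEX Engine or Assembler API. The sample records, the `DIMENSIONS` tuple, and the `query` function are assumptions, and the sketch covers keyword search over records only, not over dimensions.

```python
# Hypothetical records for an online wine store.
records = [
    {"name": "Hillside Merlot", "Wine Type": "Red", "Region": "Napa"},
    {"name": "Coastal Chardonnay", "Wine Type": "White", "Region": "Sonoma"},
    {"name": "Valley Merlot", "Wine Type": "Red", "Region": "Sonoma"},
]
DIMENSIONS = ("Wine Type", "Region")


def query(records, selections=None, keyword=None):
    """Return (matching records, follow-on refinements).

    `selections` maps dimension names to chosen values (navigation);
    `keyword` is matched against the record name (keyword search).
    """
    selections = selections or {}
    hits = [
        r for r in records
        if all(r[d] == v for d, v in selections.items())
        and (keyword is None or keyword.lower() in r["name"].lower())
    ]
    # Follow-on information: values still available in unselected dimensions.
    follow_on = {
        d: sorted({r[d] for r in hits})
        for d in DIMENSIONS if d not in selections
    }
    return hits, follow_on


nav_hits, nav_refinements = query(records, selections={"Wine Type": "Red"})
kw_hits, kw_refinements = query(records, keyword="merlot")
both_hits, both_refinements = query(records, {"Region": "Sonoma"}, "merlot")
```

All three calls return the same two-part result, which is why the two query styles combine so naturally in this model.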
Dimensions and dimension values -
- Dimensions are logical categories that make it possible to organize your Endeca records into a hierarchical structure that customers can search.
- A dimension is a collection of related dimension values, organized into a tree. The top-most dimension value in a dimension tree is known as the dimension root. A dimension root always has the same name as its dimension.
- Each dimension value can have one or more child dimension values; a dimension value with child dimension values is known as a parent dimension value. A child dimension value can have only one parent dimension value. Dimension values that are children of the same parent dimension value are known as sibling dimension values. The dimension values that have no children are known as leaf dimension values.
Dimension hierarchy -
- Dimension hierarchy gives you additional control over the logical structure used to organize your Endeca records.
- As the term "dimension tree" implies, dimension values can have parent and child dimension values.
- A dimension value that has sub-dimension values is the parent of those sub-dimension values. The sub-dimension values themselves are children or child dimension values.
- Child dimension values of the same parent dimension at the same level of hierarchy are dimension value siblings.
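The parent, child, sibling, and leaf relationships above map onto an ordinary tree. Here is a rough sketch in plain Python. The `DimVal` class is hypothetical and not part of any Endeca library, and the Merlot and Chianti values are made-up examples.

```python
class DimVal:
    """A dimension value with at most one parent and any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_leaf(self):
        # Leaf dimension values have no children.
        return not self.children

    def siblings(self):
        # Siblings share the same parent; the root has none.
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]


# The dimension root has the same name as its dimension.
root = DimVal("Wine Type")
red = DimVal("Red", root)
white = DimVal("White", root)
sparkling = DimVal("Sparkling", root)
merlot = DimVal("Merlot", red)
chianti = DimVal("Chianti", red)
```

Because each `DimVal` stores a single `parent`, the structure cannot represent a value with two parents.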
The following figure illustrates a typical dimension:
Flat Dimensions -
Dimensions that have only one level of hierarchy beneath the dimension root are called flat dimensions.
The one parent rule -
A dimension value can have only one parent dimension value. It is possible, however, for a dimension value to be simultaneously a child of one dimension value and the parent of other dimension values.
Ancestors -
The ancestors of a given dimension value are all the dimension values between the given dimension value and the dimension root. The parent of the given dimension value is also one of its ancestors.
In the example below, Other and Fortified represent the ancestors for the Sherry dimension value.
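Finding ancestors amounts to walking parent links until you reach the root. The sketch below assumes a Wine Type > Other > Fortified > Sherry chain, reconstructed from the sentence above. The `PARENT` map and the `ancestors` function are hypothetical, and the root is excluded because the definition speaks of values between the given value and the root.

```python
ROOT = "Wine Type"

# Child -> parent links for one branch of the assumed Wine Type dimension.
PARENT = {
    "Other": "Wine Type",
    "Fortified": "Other",
    "Sherry": "Fortified",
}


def ancestors(value):
    """Return the dimension values between `value` and the root, nearest first."""
    result = []
    current = PARENT.get(value)
    while current is not None and current != ROOT:
        result.append(current)
        current = PARENT.get(current)
    return result


sherry_ancestors = ancestors("Sherry")
```

The first entry in the result is the parent, which matches the point that a parent is also an ancestor.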
Advantages of dimension hierarchy -
You can design dimension hierarchies in ways that reduce the number of follow-on queries that are presented to users as they navigate.
For example, in the flat dimension below, a navigation query on the Wine Type dimension value would return six possible refinement queries, one for each child.
A simple flat dimension such as the one shown in the preceding figure is small enough to navigate through quickly. Larger flat dimensions, however, present too many choices for follow-on queries to be easily navigable.
Organizing dimension values into a hierarchy reduces this information overload and provides for an easier, more intuitive navigation experience. In the hierarchical example below, the Wine Type dimension value has only three possible refinement queries: Red, White, or Sparkling.
A second reason to use dimension hierarchy is that, by limiting the number of refinement queries, you decrease the amount of time it takes for the MDEX Engine to return its results. Returning query results that contain follow-on information for 20 refinement queries is faster than returning query results that contain information for 2000 refinement queries.
Note: The time that the MDEX Engine takes to process a large flat dimension can be lessened by requesting that the MDEX Engine return only the most important refinements.
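The flat versus hierarchical comparison can be checked with a small sketch. The six leaf names and the `refinements` helper are assumptions for illustration. Only the counts (six versus three) and the Red, White, and Sparkling values come from the discussion above.

```python
# Flat: every value sits directly beneath the dimension root.
flat = {
    "Wine Type": ["Merlot", "Chianti", "Chardonnay",
                  "Riesling", "Champagne", "Prosecco"],
}

# Hierarchical: the same leaves grouped under three intermediate values.
hierarchical = {
    "Wine Type": ["Red", "White", "Sparkling"],
    "Red": ["Merlot", "Chianti"],
    "White": ["Chardonnay", "Riesling"],
    "Sparkling": ["Champagne", "Prosecco"],
}


def refinements(dimension, selected):
    """Return the child values offered as follow-on queries for `selected`."""
    return dimension.get(selected, [])


flat_choices = refinements(flat, "Wine Type")
hier_choices = refinements(hierarchical, "Wine Type")
red_choices = refinements(hierarchical, "Red")
```

With six values the difference is small, but the same grouping is what keeps a dimension with thousands of values navigable.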
Having gone through the concepts above, we should next look at Guided Navigation and Using Keyword Search.