Thursday, October 11, 2012

Oracle ATG Search 9.3 configuration on Local (Windows 7 , 64 bit) machine

I recently installed ATG Search 9.3 on my local Windows machine and configured the search environment and index. It was good fun, as it threw up quite a few errors to fix before my first index ran locally. Below are the main steps and the errors I faced in the process.


Let's first start with the prerequisites –
You should have working/running Store and BCC JBoss instances on your local system. I also assume that you have already installed Search 9.3 in your local ATG home.

Now let's focus on the search configuration. Please follow the steps below in the same sequence.

1) Delete any existing projects and environments from Search Admin in the BCC Admin screen.
2) Go to the Store ACC and delete everything from SearchConfigurationRepository under Content. You may leave the searchindex and logicalpartition items in it.

3) Set remoteHost=localhost and remotePort=8860 in the components below, as per your local store (a sketch of the properties file follows the list of paths).
C:\ATG\ATG9.1\home\servers\pub\localconfig\atg\commerce\search\refinement\RemoteCatalogRefineConfigAdapter.properties
C:\ATG\ATG9.1\home\servers\pub\localconfig\atg\commerce\search\config\RemoteCatalogRankConfigAdapter.properties
C:\ATG\ATG9.1\home\servers\pub\localconfig\atg\commerce\search\config\RemoteSearchUpdateAdapter.properties
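
For reference, each of the three adapter components above only needs these two properties overridden in its localconfig .properties file; a minimal sketch of one of them, using the host and port from this local setup (the same two lines go into all three components):

# RemoteSearchUpdateAdapter.properties (same lines for the Rank and Refine adapters)
remoteHost=localhost
remotePort=8860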

4) Now go to the BCC and create a new search project called production under Search Admin,
and add content as below:
ATG Repository - Local - /atg/commerce/search/ProductCatalogOutputConfig

5) Now go to "Language Customizations (Pre-Index)" under the Content option of the new project (production):
a) Add synonyms using the option "Select Custom Term Dictionaries"
b) Add languages such as English, Italian, French, German, Japanese and Korean using the option "Core Language Support"

6) Now add the Post-Index Customizations options, in the below sequence ONLY:
Merchandising Search Config - Remote - localhost:8860 - /atg/commerce/search/config/CatalogRankConfigAdapter
Refine Config (Facet Set) - Remote - localhost:8860 - /atg/commerce/search/refinement/CatalogRefineConfigAdapter
Search Update Config (Auxiliary Data) - Remote - localhost:8860 - /atg/commerce/search/config/SearchUpdateAdapter

7) Check the Environment. You should see a default one; just rename it to production.
8) Now build the index and see if it works.

If you get the error below at BCC server startup, then do the following:
ERROR [IndexDeploymentService] Swap failed for index 35000001; aborting swap and tearing down staging. Swap rollback Policy Description: All physical partitions must have at least one successfully initialized search engine

1) Add the line below to your Store JBoss server's run.conf file and restart your Store server (the Windows equivalent is sketched after it).
JAVA_OPTS="$JAVA_OPTS -Datg.allowRemoveAllItems=true"
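
Since this walkthrough is on Windows 7, run.conf may not apply; in a stock Windows install the same system property can be appended to JAVA_OPTS in JBoss's bin\run.bat instead. A minimal sketch, assuming run.bat already defines JAVA_OPTS:

rem bin\run.bat -- allow remove-all-items on repository item descriptors
set JAVA_OPTS=%JAVA_OPTS% -Datg.allowRemoveAllItems=true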

2) Now go to the component below in your Store server's dyn/admin and run remove-all-items for each of the item descriptors of SearchConfigurationRepository listed here (a sketch of the operation tags follows the list).
http://localhost:8180/dyn/admin/nucleus//atg/search/routing/repository/SearchConfigurationRepository/

swapcheck
searchEngine
searchMachine
searchEnvironmentHost
searchEnvironment
physicalPartition
searchIndex
logicalPartition
indexingActivitySummary
indexingCommandCount
deploymentHistory
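
In dyn/admin this can be done by pasting XML operation tags into the text box on the repository component's page; a sketch for a couple of the descriptors above would look something like this (repeat one tag per descriptor; this is why the -Datg.allowRemoveAllItems=true flag is needed):

<!-- run against SearchConfigurationRepository in dyn/admin -->
<remove-all-items item-descriptor="searchEngine"/>
<remove-all-items item-descriptor="searchEnvironment"/>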

3) Now redo the whole BCC setup from step 4 onwards, as described above.

Note -- You also need to increase the TransactionTimeout setting in jboss-service.xml under your Store server's conf/ folder, for example to 30000 (a sketch follows).
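
A sketch of what that setting looks like, assuming the TransactionManagerService MBean in conf/jboss-service.xml (the exact MBean class differs between JBoss versions; the value is in seconds):

<!-- inside the TransactionManager MBean in conf/jboss-service.xml -->
<attribute name="TransactionTimeout">30000</attribute>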

Note -- You also need to increase the heap size setting in the JVM options in the JBoss run.bat file (a sketch follows).
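
A minimal sketch, assuming a stock run.bat where the heap flags are appended to the JAVA_OPTS line; the values here are examples only and depend on your data volume (the purge post further down uses 6 GB):

rem bin\run.bat -- bump the JVM heap for indexing and deployment
set JAVA_OPTS=%JAVA_OPTS% -Xms2048m -Xmx2048m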

Let me know if you face any issues with Search setup or if you know of any better way to configure search.

Tuesday, September 04, 2012

SQL Error: ORA-01591: lock held by in-doubt distributed transaction

Last week I came across ORA-01591 while running an ALTER command against one of our Oracle DB schemas. It failed with the following error.


SQL Error: ORA-01591: lock held by in-doubt distributed transaction 5.4.426183
01591. 00000 -  "lock held by in-doubt distributed transaction %s"
*Cause:    Trying to access resource that is locked by a dead two-phase commit
           transaction that is in prepared state.
*Action:   DBA should query the pending_trans$ and related tables, and attempt
           to repair network connection(s) to coordinator and commit point.
           If timely repair is not possible, DBA should contact DBA at commit
           point if known or end user for correct outcome, or use heuristic
           default if given to issue a heuristic commit or abort command to
           finalize the local portion of the distributed transaction.

I Googled around and found some help to resolve it. Here are the steps I followed to fix the issue.
1) Connect to Oracle as sysdba by using this command -- sqlplus sys as sysdba

2) select LOCAL_TRAN_ID from dba_2pc_pending;
This select returns the IDs of all pending distributed transactions; one of them will match the ID mentioned in the error above. That is the transaction you need to purge, but execute steps 3 and 4 before attempting the purge in step 5.

3) alter system enable distributed recovery;
   Above statement enables distributed recovery

4) rollback force '5.4.426183';
    commit;
Note - Use the ROLLBACK statement with the FORCE option and a text string that indicates either the local or global transaction ID of the in-doubt transaction, to roll back its local portion (use COMMIT FORCE instead if the correct outcome is to commit).

5) execute dbms_transaction.purge_lost_db_entry('5.4.426183');

Once the purge procedure executes successfully, you can retry your original ALTER/DDL command and restart your app server. The whole session is recapped below.
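
Putting the steps together, a minimal SQL*Plus session looks roughly like this, using the transaction ID from the error above (substitute the ID returned by dba_2pc_pending in your case):

-- connect first with: sqlplus sys as sysdba
SELECT local_tran_id FROM dba_2pc_pending;
ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;
ROLLBACK FORCE '5.4.426183';
COMMIT;
EXECUTE dbms_transaction.purge_lost_db_entry('5.4.426183');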

Friday, August 10, 2012

Purging Asset Versions in BCC using ATG dynamo PurgingService


I recently faced a couple of issues running the Purge Service under BCC and had to do some tuning to finally make it work on a large volume of versioned assets.
It is generally a good idea to periodically purge the versioned repository data of old projects and asset versions. Over time, the versioning system in Content Administration (CA) can accumulate a large number of asset versions and completed projects. As asset versions accumulate, they strain storage capacity and system performance, and it also becomes harder to take a copy of live data and replicate it to other environments.


The length of a purge depends on the number of repository and file assets that need to be purged. A purge with a large number of assets can be lengthy, especially the first time; try scheduling multiple smaller purges in that case.
It's also a good idea to back up all affected datastores and file systems before you start a purge.

The Purge Service generates a Summary Metrics report before starting the purge activity and another one after the purge completes.
The report details the number of projects and asset versions removed and the number of projects and asset versions that remain, so it gives a good picture of what is going to be purged and what will be left afterwards.

You will probably need to make a few changes to your BCC server instance to run the Purge Service, as the purge is likely to fail initially for various reasons. These are worth trying:

(1) The purge operation executes in a transaction. If a purge has a large number of assets, you might need to raise your application server's transaction timeout setting; for JBoss, reset the TransactionTimeout attribute (in the /server/yourserver/conf/jboss-service.xml file).

(2) JVM memory settings in JBoss - this depends on the volume of data you have. If you get memory errors while running the purge, consider increasing the heap by 1 GB at a time until the memory errors stop. In most cases going up to 6 GB is good enough, e.g. -Xms6144m -Xmx6144m in JBoss/bin/run.bat on Windows.

(3) Resolving repository data conflicts - the Purge Service might fail on ContentRepository data; to work around this, try setting VersionManagerService.enableProtectivePurge to false and then run the Purge Service again (a sketch follows).
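
A minimal sketch of that override as a localconfig properties file; note that the Nucleus path /atg/epub/version/VersionManagerService is an assumption here, so check the component's actual location in your installation:

# localconfig/atg/epub/version/VersionManagerService.properties (path assumed)
enableProtectivePurge=false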

For more details, see the Oracle ATG Content Administration documentation.

Wednesday, April 18, 2012

Effective use of robots.txt

Lately I did some work on SEO and got an opportunity to play around with the robots.txt file and apply various rules. I would like to share the common understanding around it; feel free to provide your comments or share your experiences.

As part of sensible SEO practice, it's important to keep a firm grasp on exactly what information you don't want crawled.
A robots.txt file restricts access to your site by the search engine robots that crawl the web. These bots are automated, and before they access the pages of a site they check whether a robots.txt file exists that prevents them from accessing certain pages.
You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything on your site, you don't need a robots.txt file.

The simplest robots.txt file uses two rules:
User-agent: the robot the following rule applies to
Disallow: the URL you want to block

These two lines are considered a single entry in the file. You can include as many entries as you want. You can include multiple Disallow lines and multiple user-agents in one entry.

Some examples below:

User-agent: *
Disallow: /images/

User-Agent: Googlebot
Disallow: /archive/
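
A single entry can also carry multiple Disallow lines and multiple User-agent lines, as mentioned above; a combined entry would look something like this (the blocked paths here are just placeholders):

User-agent: Googlebot
User-agent: bingbot
Disallow: /checkout/
Disallow: /account/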

The Disallow line lists the pages you want to block. You can list a specific URL or a pattern. The entry should begin with a forward slash (/).

  • To block the entire site, use a forward slash.
Disallow: /

  • To block a directory and everything in it, follow the directory name with a forward slash.

Disallow: /archive-directory/

  • To block a page, list the page.

Disallow: /checkout.jsp

  • To remove a specific image from Google Images, add the following:

User-agent: Googlebot-Image
Disallow: /images/logo.jpg

  • To remove all images on your site from Google Images:

User-agent: Googlebot-Image
Disallow: /

  • To block files of a specific file type (for example, .gif), use the following:

User-agent: Googlebot
Disallow: /*.gif$

  • To specify matching the end of a URL, use $. For instance, to block any URLs that end with .xls:

User-agent: Googlebot
Disallow: /*.xls$

We can restrict crawling where it's not needed with robots.txt.
A "robots.txt" file tells search engines whether they can access, and therefore crawl, parts of your site. This file, which must be named "robots.txt", is placed in the root directory of your site, e.g. www.example.com/robots.txt.

If you have a multi-country site, then each country's site should have its own robots.txt.
For further reading, follow these links on generating and using robots.txt:

robots.txt generator
Using robots.txt files
Caveats of each URL blocking method

Kindly note that Google only processes up to 500 KB of your robots.txt file.
