Wednesday, September 21, 2011

Partial Indexing WebScripts in Alfresco


What is Partial Indexing (PI)?
Alfresco uses the Lucene engine to search content.
Partial indexing means indexing only a chosen set of transactions, or only the transactions from a specific time period.
  • Indexing nodes for a specific year, month, day, etc.

What are the different indexing modes in Alfresco?
In the alfresco-global.properties file, we set the index recovery mode to FULL, AUTO, or VALIDATE, depending on the requirement.
index.recovery.mode=AUTO


What does each indexing mode do?

FULL: Performs a full pass-through of all recorded transactions to ensure that the indexes are up to date.

AUTO: Performs a validation and starts a recovery if necessary. In this mode, if the first N transactions are missing from the indexes, a FULL recovery is performed. If the last N transactions are missing, the indexes are "topped up" to bring them up to date, but any gaps in between are not reindexed.

VALIDATE: Checks that the first and last transaction for each store is represented in the indexes.
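
To make the AUTO rule above concrete, here is a toy model in Python. It is not Alfresco code: the function name and the return labels are made up for illustration, and it only looks at the first and last transaction, the way the mode descriptions above are worded.

```python
def choose_recovery_action(txn_ids, indexed_ids):
    """Toy model of the AUTO decision described above (hypothetical, not Alfresco code).

    txn_ids     -- ordered list of all recorded transaction ids
    indexed_ids -- set of transaction ids that are present in the index
    """
    if not txn_ids:
        return "NONE"
    first, last = txn_ids[0], txn_ids[-1]
    if first not in indexed_ids:
        # The start of the history is missing: fall back to a full rebuild.
        return "FULL"
    if last not in indexed_ids:
        # Only the tail is missing: top the index up from where it stops.
        return "TOP_UP"
    # First and last are present, so a gap in the middle goes unnoticed.
    return "NONE"


print(choose_recovery_action([1, 2, 3, 4], {2, 3, 4}))  # first txn missing
print(choose_recovery_action([1, 2, 3, 4], {1, 2}))     # last txns missing
print(choose_recovery_action([1, 2, 3, 4], {1, 4}))     # gap in the middle
```

The last call shows why a gap in the middle of the history can survive an AUTO recovery, which is one of the cases where partial indexing helps.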


What is the default location of the Lucene indexes for all stores?


Lucene Indexes Location


Where is PI useful?
CASE #1: When the content store is terabytes (TB) in size, FULL indexing may take hours or even a day. Stopping an active Alfresco server for that long is not acceptable.

CASE #2: When the indexes are corrupted and you want to run AUTO indexing but are not sure that all of the corruption can be recovered.


Key Benefits of Partial Indexing:
It takes less time.

It gives you a choice of start and end point for indexing. This is a kind of "AUTO" indexing, but superior to AUTO indexing.

It can perform a complete indexing. Find out the number of transactions in Alfresco, divide them into multiple ranges, and index each range into a separate folder. Once indexing is over, merge those folders. This is a kind of "FULL" indexing, but different from FULL indexing.

In both approaches, we control how and when to index: we decide the number of transactions and the time at which to index.
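
The "divide the transactions into multiple ranges" step can be sketched as follows. This is only an illustration of the arithmetic: the function name and the chunk size are assumptions, and the actual indexing and merging of folders is not shown.

```python
def split_txn_ranges(first_txn, last_txn, chunk_size):
    """Split the inclusive range [first_txn, last_txn] into consecutive chunks.

    Hypothetical helper: each (start, end) pair is one range that could be
    indexed on its own before the results are merged.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = first_txn
    while start <= last_txn:
        end = min(start + chunk_size - 1, last_txn)
        ranges.append((start, end))
        start = end + 1
    return ranges


print(split_txn_ranges(1, 10, 4))  # [(1, 4), (5, 8), (9, 10)]
```

Smaller chunks mean more runs but shorter ones, so the chunk size is how you trade total effort against the time taken by any single run.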


List of OOTB PI Webscripts in Alfresco:

ADM Reindex - from a point in time or from a given ADM txn id:
1) /alfresco/service/enterprise/admin/reindex/adm/from?fromTime={fromTime}
2) /alfresco/service/enterprise/admin/reindex/adm/from?fromTxnId={fromTxnId}

ADM Reindex - get Reindex/Tracker progress
1) /alfresco/service/enterprise/admin/reindex/adm/progress

ADM Index Check - check Node Status for a nodeRef or node changes for a txnId:
1) /alfresco/service/enterprise/admin/indexcheck/adm/checknodes?nodeRef={nodeRef}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checknodes?txnId={txnId}

ADM Index Check - check given Txn id or Last Txn (if txn id not specified)
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxn?txnId={txnId}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checktxn

ADM Index Check - check from/to given points in time or from a given ADM Txn Id
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTxnId={fromTxnId}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTxnId={fromTxnId}&toTxnId={toTxnId}
3) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTime={fromTime}
4) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTime={fromTime}&toTime={toTime}


ADM Index Check - check ALL Txn ids
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns/all

ADM Index Check - get Progress
1) /alfresco/service/enterprise/admin/indexcheck/adm/progress
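
If you call these webscripts from a script rather than a browser, the URLs can be assembled as in the sketch below. The paths and parameter names are taken from the list above; the base URL http://localhost:8080 and the helper names are assumptions, and authentication is left out.

```python
from urllib.parse import urlencode

# Assumption: a local Alfresco instance; replace with your own host and port.
BASE_URL = "http://localhost:8080"

REINDEX_FROM = "/alfresco/service/enterprise/admin/reindex/adm/from"
CHECK_TXNS = "/alfresco/service/enterprise/admin/indexcheck/adm/checktxns"


def build_url(base, path, **params):
    """Join base and path, appending only the parameters that were given."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return base + path + ("?" + query if query else "")


def reindex_from_txn_url(base, from_txn_id):
    # Hypothetical helper for "ADM Reindex from a given ADM txn id".
    return build_url(base, REINDEX_FROM, fromTxnId=from_txn_id)


def check_txns_url(base, from_txn_id, to_txn_id=None):
    # Hypothetical helper for "ADM Index Check from/to a given ADM txn id".
    return build_url(base, CHECK_TXNS, fromTxnId=from_txn_id, toTxnId=to_txn_id)


print(reindex_from_txn_url(BASE_URL, 1200))
print(check_txns_url(BASE_URL, 100, 200))
```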


UI of Partial Indexing :

By hitting the following URL, you can access all PI webscripts in one page. Here, you can check whether transaction ids are IN-SYNC with local indexes or not.



Partial Indexing WebScript Interface



This page contains both secure and nonsecure items Security Warning in IE


When you browse to a page over HTTPS, you may receive the following Security Information warning, mostly in Internet Explorer.



1) This normally occurs because the page contains external references with blank URLs (iframes, images, stylesheets, etc.) that are filled in later via JavaScript.
These blank references can be the culprit because the browser treats them as HTTP objects by default, until a script sets them to the correct HTTPS URL.


For example:
    <iframe id="iframeID" src="">
    </iframe>

Here, the src attribute is left blank.


Approach 1:
To fix this, we can point the src attribute at an empty file such as blank.html:

<iframe id="iframeID" src="blank.html">
</iframe>

The correct URL can then be set through script.


Approach 2:
Alternatively, we can set the src attribute to javascript:'';
  <iframe id="iframeID" src="javascript:'';">
  </iframe>

2) Change all links to relative paths.


If the images or scripts are located on the same domain, you can reference them relatively rather than absolutely:


For example:
  <img src="/images/header.gif"/>

Using this method, the browser loads the image securely when the web page is loaded securely, and loads it normally when the page is not accessed securely. This is likely the best way of getting rid of the above issue.
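
To find the references that may trigger the warning, you can scan a page for the two patterns discussed above: a blank src and an absolute http:// src. The following is a minimal sketch using Python's standard html.parser module. The class and function names are made up for illustration, and it only looks at src attributes, so stylesheets referenced through href are not covered.

```python
from html.parser import HTMLParser


class RiskySourceFinder(HTMLParser):
    """Collects tags whose src is blank or points at an absolute http:// URL."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are routed here as well.
        for name, value in attrs:
            if name != "src":
                continue
            if value is None or value.strip() == "":
                self.findings.append((tag, "blank src"))
            elif value.lower().startswith("http://"):
                self.findings.append((tag, "absolute http src"))


def find_risky_sources(html):
    finder = RiskySourceFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


page = (
    '<iframe id="iframeID" src=""></iframe>'
    '<img src="/images/header.gif"/>'
    '<img src="http://example.com/logo.gif">'
)
print(find_risky_sources(page))
```

The relative image path is not reported, while the blank iframe and the absolute http:// image are.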