Wednesday, September 21, 2011

Partial Indexing WebScripts in Alfresco


What is Partial Indexing (PI)?
Alfresco uses the Lucene engine to index and search content.
Partial Indexing means indexing only a chosen set of transactions, or only the transactions from a specific time period.
  • For example, indexing the nodes for a specific year, month, or day.

What are the different types of indexing in Alfresco?
In the alfresco-global.properties file, we set the index recovery mode to FULL, AUTO, or VALIDATE, depending on the requirement.
index.recovery.mode=AUTO


What does each indexing mode do?

FULL: Performs a full pass-through of all recorded transactions to ensure that the indexes are up to date.

AUTO: Performs a validation and starts a recovery if necessary. In this mode, if the first N transactions are missing from the indexes, a “FULL” recovery is performed. If the last N transactions are missing, the indexes are "topped up" to bring them up to date, but transactions missing in between are not indexed.

VALIDATE: Checks that the first and last transaction for each store is represented in the indexes.
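
The AUTO behaviour described above can be pictured as a small decision function. This is only an illustrative model of the description in this post, not Alfresco's actual implementation; the names `Action` and `auto_recovery_action` are made up for the example, and the two boolean inputs are a simplification of the "first N / last N transactions" check.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes of the (simplified) AUTO check."""
    NONE = "indexes are up to date"
    TOP_UP = "index the missing trailing transactions"
    FULL = "rebuild the indexes from the first transaction"


def auto_recovery_action(first_txns_indexed: bool, last_txns_indexed: bool) -> Action:
    """Model the AUTO decision as described in the text (hypothetical helper)."""
    if not first_txns_indexed:
        # The earliest transactions are missing: fall back to a FULL recovery.
        return Action.FULL
    if not last_txns_indexed:
        # Only the most recent transactions are missing: top the indexes up.
        return Action.TOP_UP
    # Both ends are present. Gaps in between are NOT detected by this check.
    return Action.NONE
```

The last branch is the reason Partial Indexing matters: a gap in the middle of the transaction history passes this check unnoticed.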


What is the default location of the Lucene indexes for all stores?


Lucene Indexes Location


Where is PI useful?
CASE #1: When the content store is terabytes (TB) in size, FULL indexing may take hours or even a day. Keeping an active Alfresco server stopped for that long is not acceptable.

CASE #2: When the indexes are corrupted and you want to run AUTO indexing, but you are not sure that all of the corruption can be recovered.


Key Benefits of Partial Indexing:
It takes less time.

It gives you a choice of the start and end points of indexing. This is a kind of “AUTO” indexing, but with more control than AUTO indexing.

It can perform complete indexing. Find the number of transactions in Alfresco, divide them into multiple ranges, and run the indexing for each range in a separate folder. Once the indexing is over, merge those folders. This is a kind of “FULL” indexing, but different from FULL indexing.

In both approaches, you decide how and when to index, including the number of transactions and the time window to index.
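
As a rough sketch of the "divide the transactions into multiple ranges" step, the helper below splits an inclusive range of transaction ids into nearly equal, contiguous chunks. The function name `split_txn_ranges` and its parameters are assumptions made for this example; how you obtain the first and last transaction id, and how you merge the resulting index folders, is outside its scope.

```python
from typing import List, Tuple


def split_txn_ranges(first_txn_id: int, last_txn_id: int, parts: int) -> List[Tuple[int, int]]:
    """Split [first_txn_id, last_txn_id] into at most `parts` contiguous ranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if last_txn_id < first_txn_id:
        raise ValueError("last_txn_id must not be smaller than first_txn_id")

    total = last_txn_id - first_txn_id + 1
    parts = min(parts, total)           # never create empty ranges
    size, extra = divmod(total, parts)  # the first `extra` ranges get one more id

    ranges = []
    start = first_txn_id
    for i in range(parts):
        end = start + size - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges


print(split_txn_ranges(1, 10, 3))  # [(1, 4), (5, 7), (8, 10)]
```

Each `(from, to)` pair can then be indexed on its own schedule, which is what gives you control over how much work is done at a time.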


List of OOTB PI Webscripts in Alfresco:

ADM Reindex - from a point in time or from a given ADM txn id:
1) /alfresco/service/enterprise/admin/reindex/adm/from?fromTime={fromTime}
2) /alfresco/service/enterprise/admin/reindex/adm/from?fromTxnId={fromTxnId}
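
To avoid typing these URLs by hand, a small helper can build them. This is a minimal sketch: the base URL `http://localhost:8080` and the function name `reindex_from_url` are assumptions for the example, and the value of `fromTime` is passed through unchanged because its expected format is not stated here. The helper only builds the URL; it does not call the server.

```python
from urllib.parse import urlencode

# Path taken from the webscript list above.
REINDEX_FROM_PATH = "/alfresco/service/enterprise/admin/reindex/adm/from"


def reindex_from_url(base_url, *, from_time=None, from_txn_id=None):
    """Build the ADM reindex URL from either a point in time or a txn id."""
    if (from_time is None) == (from_txn_id is None):
        raise ValueError("pass exactly one of from_time or from_txn_id")
    if from_txn_id is not None:
        query = {"fromTxnId": from_txn_id}
    else:
        query = {"fromTime": from_time}
    return base_url.rstrip("/") + REINDEX_FROM_PATH + "?" + urlencode(query)


# Hypothetical host and transaction id, for illustration only.
print(reindex_from_url("http://localhost:8080", from_txn_id=12345))
```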

ADM Reindex - get Reindex/Tracker progress
1) /alfresco/service/enterprise/admin/reindex/adm/progress

ADM Index Check - check Node Status for a nodeRef or node changes for a txnId:
1) /alfresco/service/enterprise/admin/indexcheck/adm/checknodes?nodeRef={nodeRef}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checknodes?txnId={txnId}

ADM Index Check - check given Txn id or Last Txn (if txn id not specified)
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxn?txnId={txnId}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checktxn

ADM Index Check - check from/to given points in time or from a given ADM Txn Id
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTxnId={fromTxnId}
2) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTxnId={fromTxnId}&toTxnId={toTxnId}
3) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTime={fromTime}
4) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns?fromTime={fromTime}&toTime={toTime}
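
The four checktxns variants above differ only in which query parameters are present, so they can be generated by one function. The sketch below assumes a hypothetical helper name `checktxns_url` and a caller-supplied base URL; time values are passed through as opaque strings because their format is not given here. It builds the URL only and does not send a request.

```python
from urllib.parse import urlencode

# Path taken from the webscript list above.
CHECKTXNS_PATH = "/alfresco/service/enterprise/admin/indexcheck/adm/checktxns"


def checktxns_url(base_url, *, from_txn_id=None, to_txn_id=None,
                  from_time=None, to_time=None):
    """Build a checktxns URL for a txn-id range or a time range."""
    by_txn = from_txn_id is not None
    by_time = from_time is not None
    if by_txn == by_time:
        raise ValueError("pass exactly one of from_txn_id or from_time")
    if by_txn and to_time is not None:
        raise ValueError("to_time can only be combined with from_time")
    if by_time and to_txn_id is not None:
        raise ValueError("to_txn_id can only be combined with from_txn_id")

    if by_txn:
        params = [("fromTxnId", from_txn_id), ("toTxnId", to_txn_id)]
    else:
        params = [("fromTime", from_time), ("toTime", to_time)]
    # Drop the optional upper bound when it was not supplied.
    query = urlencode([(key, value) for key, value in params if value is not None])
    return base_url.rstrip("/") + CHECKTXNS_PATH + "?" + query
```

Combined with a list of transaction ranges, this makes it easy to check one range at a time instead of running the check over ALL transaction ids.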


ADM Index Check - check ALL Txn ids
1) /alfresco/service/enterprise/admin/indexcheck/adm/checktxns/all

ADM Index Check - get Progress
1) /alfresco/service/enterprise/admin/indexcheck/adm/progress


UI of Partial Indexing:

By opening the following URL, you can access all the PI webscripts on one page. There, you can check whether the transaction IDs are IN-SYNC with the local indexes.



Partial Indexing WebScript Interface

