Alfresco node lifecycle, services and zones

Alfresco Node Lifecycle, Services & Zones by : Sanket Mehta

description

This presentation explains the Alfresco node lifecycle in detail, including which Alfresco database tables are affected by node operations such as creation and deletion. It also explains which of the case-sensitive Alfresco service beans should be used (nodeService vs NodeService, searchService vs SearchService) in order to maintain security in your application. Lastly, it covers zones in Alfresco (authentication-related zones and application-related zones).

Transcript of Alfresco node lifecycle, services and zones

Page 1: Alfresco node lifecycle, services and zones

Alfresco Node Lifecycle, Services & Zones

by : Sanket Mehta

Page 2: Alfresco node lifecycle, services and zones

Topics covered

• Which Alfresco services to use, and when

• Node Lifecycle

• Zones

Page 3: Alfresco node lifecycle, services and zones

NodeService vs nodeService

• When you inject Alfresco services, it is best practice (highly recommended) to use the bean whose name starts with an upper-case letter (e.g. NodeService) instead of the lower-case one (nodeService).

• Reason: nodeService bypasses the security and transaction checks and performs the operation on the node directly.
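The difference can be pictured with a toy model. This is a minimal sketch in Python, not Alfresco code: the class names, the `delete_node` method and the permission table are all hypothetical, and it only illustrates the idea of a public bean that checks before delegating to a raw bean that checks nothing.

```python
class RawNodeService:
    """Hypothetical stand-in for the lower-case bean: no checks at all."""

    def __init__(self):
        self.deleted = []

    def delete_node(self, node_ref, user):
        # Performs the operation directly, whoever the caller is.
        self.deleted.append(node_ref)
        return True


class PublicNodeService:
    """Hypothetical stand-in for the upper-case bean: checks, then delegates."""

    def __init__(self, raw, permissions):
        self._raw = raw
        # Maps a user name to the set of node refs that user may delete.
        self._permissions = permissions

    def delete_node(self, node_ref, user):
        if node_ref not in self._permissions.get(user, set()):
            raise PermissionError(f"{user} may not delete {node_ref}")
        return self._raw.delete_node(node_ref, user)


raw = RawNodeService()
public = PublicNodeService(raw, {"alice": {"node-1"}})

public.delete_node("node-1", "alice")    # allowed: alice has permission

try:
    public.delete_node("node-2", "bob")  # refused by the checking wrapper
    bob_blocked = False
except PermissionError:
    bob_blocked = True

raw.delete_node("node-2", "bob")         # the raw service lets bob through
```

Code that is handed the raw service can do anything, which is why the checked bean is the one to inject.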

Page 4: Alfresco node lifecycle, services and zones

Files involved

• public-services-context.xml - where NodeService is defined (this is the bean that users should access)

• node-services-context.xml - where the actual nodeService is defined.

• core-services-context.xml - defines the registryService bean, into which NodeService and SearchService are injected.

Page 5: Alfresco node lifecycle, services and zones

Alfresco Node Lifecycle

• The node can be called the heart of the Alfresco content repository.

• The content repository is composed of three important parts:

– Database (containing node metadata)

– File system (storing the actual content)

– Indexes (containing node information from both the database and the file system)

• If the indexes are lost or corrupted, they can be rebuilt by reindexing.

Page 6: Alfresco node lifecycle, services and zones

Step 1 : Creation of node

• User creates a node in Alfresco.

• Content gets created at <alf_dir>/alf_data/contentstore/<date-time>/content_uuid.bin

• NOTE: This uuid is not the same as the nodeRef of the content.

• The nodeRef of the content is stored in the database; entries for the new node are added to the tables alf_content_data, alf_content_url and alf_node.

• alf_content_url stores the content URL and its short form.

• alf_node holds the full nodeRef of the content along with the store id (6), which stands for workspace://SpacesStore.

• The search index is created under alf_data/lucene-indexes/workspace/SpacesStore.

• With the Solr search engine, the indexes are created under alf_data\solr\workspace\SpacesStore\index.
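To make the link between the content URL in alf_content_url and the file on disk concrete, here is a small Python sketch. It assumes the content URL has the form store://year/month/day/hour/minute/<uuid>.bin, which matches the <date-time> folder layout above; the helper name, the default store root and the example uuid are hypothetical.

```python
from pathlib import PurePosixPath

STORE_PREFIX = "store://"


def content_url_to_path(content_url, store_root="alf_data/contentstore"):
    """Map a content URL to its assumed location under the content store."""
    if not content_url.startswith(STORE_PREFIX):
        raise ValueError(f"not a content store URL: {content_url}")
    # Everything after the prefix is treated as a path relative to the store root.
    relative = content_url[len(STORE_PREFIX):]
    return PurePosixPath(store_root) / relative


# Hypothetical content URL; the uuid here identifies the content file,
# not the node, so it differs from the uuid in the nodeRef.
example_path = content_url_to_path("store://2014/3/7/10/45/1f2e3d4c.bin")
print(example_path)
```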

Page 7: Alfresco node lifecycle, services and zones

Step 2 : Deletion of node

• User deletes the node.

• After deletion (from any UI: DM, Share or another interface), the content stays in exactly the same place (alf_data/contentstore).

• In the db (alf_node table), the node is marked as living in a different store (archive://SpacesStore, which has store id 5).

• The node remains in the search index, but it has now moved to the archive store (alf_data/lucene-indexes/archive/SpacesStore).

• With Solr search, it moves to alf_data\solr\archive\SpacesStore\index.

• At this point, if a user goes to 'Manage deleted items' from the user profile and restores the item, the node moves back to workspace://SpacesStore.

• The db store id changes back from 5 to 6.

• The index moves back from alf_data/lucene-indexes/archive/SpacesStore to alf_data/lucene-indexes/workspace/SpacesStore.
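The delete and restore transitions can be summarised in a toy Python model. This is only a sketch of the bookkeeping described above, not Alfresco code: the `Node` class and `index_dir` helper are hypothetical, and the index root assumes the Lucene layout from the slides.

```python
WORKSPACE = "workspace://SpacesStore"
ARCHIVE = "archive://SpacesStore"


def index_dir(store_ref, root="alf_data/lucene-indexes"):
    """Derive the index folder for a store, e.g. .../archive/SpacesStore."""
    protocol, identifier = store_ref.split("://")
    return f"{root}/{protocol}/{identifier}"


class Node:
    """Toy node: only the store changes; the content file never moves."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.store = WORKSPACE

    @property
    def node_ref(self):
        return f"{self.store}/{self.uuid}"

    def delete(self):
        # Deleting from the UI only moves the node to the archive store.
        self.store = ARCHIVE

    def restore(self):
        # 'Manage deleted items' > restore moves it back.
        self.store = WORKSPACE


node = Node("abc-123")  # hypothetical uuid
node.delete()
after_delete = (node.node_ref, index_dir(node.store))
node.restore()
after_restore = (node.node_ref, index_dir(node.store))
```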

Page 8: Alfresco node lifecycle, services and zones

Step 3: Empty the trashcan

• User empties the trashcan.

• Let's assume he empties the trashcan 30 days after deleting the node.

• What happens now?

• File system: the content stays in the same place.

• DB: the node is not yet deleted; it is only marked as deleted.

• The alf_node table has a field named 'node_deleted', which is set to '1' to indicate a deleted node.

• NOTE: From Alfresco 4.1.x onwards, the node_deleted column is removed from the table and the sys:deleted type is applied to identify deleted nodes.

• Alfresco now considers any related content file found in the file-system content store an 'orphan'.

• At the point where 'node_deleted' becomes '1', the orphan is declared by updating the 'orphan_time' field in the alf_content_url table from null to current_timestamp.

• This makes it quick to identify orphaned items later on.

• From now on, all db queries made by Alfresco only read rows where node_deleted = 0.

• The node is removed from all search indexes (i.e. it cannot be found in either workspace/SpacesStore or archive/SpacesStore).

Page 9: Alfresco node lifecycle, services and zones

Step 4: Node's last breath

• An orphan-cleaner job (scheduler) runs; this is the contentStoreCleanerTrigger.

• By default, it executes at 4 AM every night.

• The orphan cleaner doesn't act on orphans immediately. It waits for a period of x protected days.

• That is, it queries the alf_content_url table for rows whose orphan_time is more than 14 days old.

• The number of protected days is read from content-services-context.xml, which in turn takes it from repository.properties.

• It does not actually delete the content files; it simply moves them out of the 'alf_data/contentstore' folder and into the 'alf_data/contentstore.deleted' folder.

• After an orphaned content file has been moved out of the active content store, the relevant row in the alf_content_url table is deleted from the DB.

• Orphaned content files that have been removed from the content store sit in the 'contentstore.deleted' folder forever, until a system administrator backs them up, moves them, or deletes them.

• So the node is finally gone from the file system (moved from contentstore to contentstore.deleted).

• DB: the alf_node table is still unchanged (however, the reference in alf_content_url is removed).

• Search index: the node doesn't exist in any search index.
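The protection check in this step comes down to a date comparison. Below is a minimal Python sketch of that rule, assuming the 14-day default mentioned above; the function name and the sample timestamps are hypothetical, and the real job works on rows of alf_content_url rather than Python values.

```python
from datetime import datetime, timedelta

ORPHAN_PROTECT_DAYS = 14  # default protection period from the slides


def is_cleanable(orphan_time, now, protect_days=ORPHAN_PROTECT_DAYS):
    """Return True if the cleaner may move this content file out of the store."""
    if orphan_time is None:
        # A null orphan_time means the content is still referenced by a node.
        return False
    return now - orphan_time > timedelta(days=protect_days)


# Hypothetical run of the nightly job at 4 AM.
run_time = datetime(2024, 3, 31, 4, 0)

still_referenced = is_cleanable(None, run_time)
recent_orphan = is_cleanable(datetime(2024, 3, 20, 12, 0), run_time)
old_orphan = is_cleanable(datetime(2024, 3, 1, 12, 0), run_time)
```

Only the last case qualifies: the first file is not an orphan at all, and the second was orphaned less than 14 days before the run.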

Page 10: Alfresco node lifecycle, services and zones

Step 5 : Removal from db

• A scheduled job (nodeServiceCleanupTrigger) runs at 9 PM every day to clean the db.

• 30 days after the 'node_deleted' field was set to '1', this process considers it safe to truly delete the node.

• Note: it doesn't use the audit_modified date, since this wasn't changed when the row was marked for deletion. Instead, it uses the commit_time_ms transaction time from the alf_transaction table.

• Note: this job also removes old transactions from the alf_transaction table. Transactions are considered old using the same threshold as node removal ('30 days'), defined by the property 'index.tracking.minRecordPurgeAgeDays'.

• So, finally, the node is:

• File system : Gone

• DB : Gone

• Index : Gone

• So, 14 days after a node is removed from the archive store, it is taken out of the content store on the file system, and after a further 16 days (approx.) it is finally removed from the database too.
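The overall timeline can be worked out with simple date arithmetic. The Python sketch below assumes the defaults quoted in the slides (14 protected days, 30 purge days, jobs at 4 AM and 9 PM) and simplifies by treating each threshold as "first job run after the period has elapsed"; the function names and the sample date are hypothetical.

```python
from datetime import datetime, timedelta


def next_run(after, hour):
    """First run of a daily job at hour:00 that is strictly later than `after`."""
    candidate = after.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)
    return candidate


def purge_timeline(trashcan_emptied, protect_days=14, purge_age_days=30):
    """Estimate when the content file is moved and when the db row is purged."""
    # Orphan cleaner: 4 AM job, once the protection period has passed.
    content_moved = next_run(trashcan_emptied + timedelta(days=protect_days), hour=4)
    # Node cleanup: 9 PM job, once the purge age has passed.
    row_purged = next_run(trashcan_emptied + timedelta(days=purge_age_days), hour=21)
    return content_moved, row_purged


# Hypothetical example: trashcan emptied on 1 January at 10:00.
moved, purged = purge_timeline(datetime(2024, 1, 1, 10, 0))
print("content moved to contentstore.deleted:", moved)
print("row removed from the database:", purged)
```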

Page 11: Alfresco node lifecycle, services and zones

Alfresco node lifecycle

Page 12: Alfresco node lifecycle, services and zones

Zones

• Zones are used to classify authorities.

• For example, Alfresco synchronization uses zones to record which LDAP server users and groups have been synchronized from.

• Zones are also used to hide some groups that provide Role Based Access Control (RBAC) role-like functionality from the administration pages of the Alfresco Explorer and Alfresco Share web clients.

• Examples of hidden groups are the roles used in Alfresco Share.

• Only users and groups in the default zone are shown on the Alfresco Explorer/Share administration pages.

• Every user or group in Alfresco falls under one or more zones.

Page 13: Alfresco node lifecycle, services and zones

• Zones cannot be managed from the administration pages of Alfresco Explorer and Share.

• Zones are grouped into two areas: application-related zones and authentication-related zones.

• Application-related zones are prefixed with APP, whereas authentication-related zones are prefixed with AUTH.

• To view the zones, go to: Node Browser > workspace://SpacesStore > System > zones.
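Because the area of a zone is carried by its prefix, zone names can be sorted into the two areas with a simple string check. This Python sketch only illustrates the naming convention; the function name is hypothetical and nothing here calls Alfresco.

```python
def zone_area(zone):
    """Classify a zone name by its prefix."""
    if zone.startswith("APP."):
        return "application"
    if zone.startswith("AUTH."):
        return "authentication"
    return "unknown"


# Zone names taken from the slides.
zones = ["APP.DEFAULT", "APP.SHARE", "APP.RM", "AUTH.ALF", "AUTH.EXT.internalLDAP"]

by_area = {}
for zone in zones:
    by_area.setdefault(zone_area(zone), []).append(zone)

print(by_area)
```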

Page 14: Alfresco node lifecycle, services and zones

• AuthorityContainer and Person are sub-classes of Authority, and as such can be in any number of zones.

• Example: APP.SHARE (a zone) > GROUP_site_oreilly-clms (and all the ROLES you see there), which is an authorityContainer.

• Inside an authorityContainer (group) there are members (persons).

Page 15: Alfresco node lifecycle, services and zones

Application-related zones

• Application-related zones other than the default (APP.DEFAULT) hide groups that implement RBAC-like roles (e.g. APP.SHARE, APP.RM).

• APP.DEFAULT is for person and group nodes that should be found by a normal search (through DM and Share).

• By default, every user and group you create in Alfresco DM/Share belongs to this default zone.

• The out-of-the-box groups (ALFRESCO_ADMINS, EMAIL_CONTRIB..) also belong to the default zone.

• But oreilly-clms_SiteManager, SiteContributor, etc. are ROLES, so they don't belong to the default zone.

• APP.SHARE is for hidden authorities related to Alfresco Share.

• APP.RM is added for authorities related to RM.

Page 16: Alfresco node lifecycle, services and zones

Authentication-related zones

• Authentication-related zones record where an authority is defined, and therefore authenticated: inside Alfresco or in an external source.

• AUTH.ALF is for authorities defined within Alfresco and not synchronized from an external source. This is the default zone for authentication.

• Ex: Go to the AUTH.ALF zone. It shows the authorityContainers and the users belonging to them, but not the ones synced from LDAP.

• AUTH.EXT.<ID> is for authorities defined externally, such as in LDAP.

• Ex: Go to AUTH.EXT.supplierLDAP and AUTH.EXT.internalLDAP.

• These show the users and groups that were synced from LDAP (depending on which sync was run: full or differential).

• More on LDAP sync:

• http://wiki.alfresco.com/wiki/The_Synchronization_Subsystem
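Given the set of zones an authority belongs to, its origin can be read off the authentication zone. The Python sketch below assumes only the naming convention described above (AUTH.ALF for internal authorities, AUTH.EXT.<ID> for external ones); the function name and the returned labels are hypothetical.

```python
EXTERNAL_PREFIX = "AUTH.EXT."


def authority_origin(zones):
    """Return the external source ID, 'alfresco' for AUTH.ALF, or None."""
    for zone in sorted(zones):
        if zone.startswith(EXTERNAL_PREFIX):
            # The part after AUTH.EXT. is the ID of the external source.
            return zone[len(EXTERNAL_PREFIX):]
    if "AUTH.ALF" in zones:
        return "alfresco"
    return None


local_user = authority_origin({"APP.DEFAULT", "AUTH.ALF"})
ldap_user = authority_origin({"APP.DEFAULT", "AUTH.EXT.internalLDAP"})
no_auth_zone = authority_origin({"APP.SHARE"})
```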

Page 17: Alfresco node lifecycle, services and zones

Thank you

Questions?