Latest Tweets

Altering the Dspace tree nodes’ submitters group(s) recursively

The issue

Our current implementation of Dspace software does not allow to alter recursively all Collections’ submitters group[s]. That means we have only one way to do that; item by item, which is, obviously, quite annoying. So it seems to be completely necessary to write a trivial script with a trivial recursive subroutine to browse the tree and update the right nodes once and for all. That’s what we will discuss here.

A classic tree structure to store data

Take a quick look at the next figure; it shows a classic tree view concerning how “Communities” , “Subcommunities” and “Collections” terms are stored and accesed on Dspace software, internally.

tree

As shown in the figure, CO are the “Communities” and “Subcommunities”, and CL are the sheeves nodes, known as “Collections” on Dspace terminology. As a matter of fact, only the nodes labeled as CL have the submitter property. Peering inside Dspace database structure, the table where a CL node is stored is called Collections. Inside this table, the property Submitters stores an integer value, referencing a EPerson or a EGroup entry. Looking at the figure, it is quite obvious a CO node can have different node-children which, in turn, can be CO or CL nodes.

As explained before, Dspace has no options to update recursively using its visual administrative tools the “Submitter” property on Collection relation. There’s no option at all.

Thus, I wrote a php script in charge of dealing with the PostgreSQL Dspace relations directly, browsing the tree and updating the Submitters property for the relation  Collection as necesary.

However, the Dspace software is in need of adding an extension’s row  for any CL node concerning the same EGroup or EPeople ID in the relation ResourcePolicy, with the ADD permission.

So the script adds, at the same time, an extension’s row for any CL node updated or removes it, in case of not allowing the group as a valid tree’s submitter. That is, when adding the group “MyGroup” as a submitter for child-nodes CO11,CO111,CL1111, it updates the CL1111[submitter] and the CL1111[admin] properties and adds T11, T111, T1111 extension’s rows to the relation ResourcePolicy.

The script

Written using php, it can be executed directly from a terminal. It admits different arguments. Below, how it looks like:

/* The allowed groups coming from LDAP service allowed to use Dspace software */
/* These fields are gather directly from the LDAP database at UPC. */
define(_UNIT_CODE_FA,720);
define(_UNIT_CODE_FEN,721);
 
/* The groups inserted inside the database allowed to submit collections */
/* These groups must exist inside "Dspace" database. You can create them using the "visual" Administer groups tool. */
define(_SUBMITTER_GROUP_FA,182);
define(_SUBMITTER_GROUP_FEN,181);
 
/* Define the two main communities where we start dealing with */
/* Take a look at the DSPACE's database */
define(_FA_COMMUNITY,10);
define(_FEN_COMMUNITY,11);
 
/* Those parameters set up a generic user and password for binding to LDAP server, and
the LDAP server by itself*/
define(_LDAP_SERVER,"mi_ldap_server");
define(_LDAP_BASE_DN,"dc=mydc,dc=mydc");
define(_LDAP_UNIT_CODE,"my_unit_code_field");
define(_LDAP_USER,"binding_info_data");
define(_LDAP_PWD,"binding_password");
 
/* PostgreSQL Database options */
define(_POSTGRESQL_HOST,"my_postgresql_host");
define(_POSTGRESQL_PORT,666);
define(_POSTGRESQL_DATABASE,"Dspace database name");		// Get this option value directly from /usr/src/dspace-XXX/config/dspace.conf
define(_POSTGRESQL_USER,"dspace username");
define(_POSTGRESQL_PWD,"database password");
 
/* The tables where users, groups, collections, communities and policies reside */
define(_DSPACE_GROUP_TABLE,"EPersonGroup");
define(_DSPACE_USER_TABLE,"EPerson");
define(_DSPACE_COMMUNITY_TABLE,"community");
define(_DSPACE_COLLECTION_TABLE,"collection");
define(_DSPACE_POLICES_TABLE,"ResourcePolicy");
 
// The valid flags passed to the php script:
//	-u	name.surname	- must exist on LDAP directory -
//	-i	Retrieve information about the user - just the user in the format name.surname -
//	-a	Add the user or the group to the submitters of FA or FEN.
//	-r	Remove the user or the group from the submitters of FA or FEN.
//	-b The user or the group belongs to fa or fen: -b fa , -b fen.
//
define(_VALID_FLAGS,"u:i:a:r:b:");
define(_USAGE_MESSAGE,"dspace-update-users -i|-u ldap.user [-a|-r community|subcommunity] [-b fa|fen]\n");
 
// ------------------------------------------------------------------------------------------------------------------------------
//    ldap_get_user_unitCode
//        Args:    string    =    user in the format "name.surname".
//        Ret:    int    =    unitCode or -1, in case of error
// ------------------------------------------------------------------------------------------------------------------------------
function ldap_get_user_unitCode ($user){
 $retVal = -1;
 $lc = ldap_connect(_LDAP_SERVER);
 if($lc){
 ldap_set_option($lc, LDAP_OPT_PROTOCOL_VERSION, 3);
 /* Okay, we're connected now. Try to "bind" : */
 $lbind = ldap_bind($lc, _LDAP_USER, _LDAP_PWD);
 if($lbind){
 /* We're binded, ask for the user : */
 $flds = array(_LDAP_UNIT_CODE);
 $lrs = ldap_search($lc , _LDAP_BASE_DN, "(cn=" . $user . ")" , $flds);
 /* Get the total fields returned: */
 $entries = ldap_get_entries($lc, $lrs);
 if($entries["count"]!=0){
 /* Only when we've got at least ONE entry with unitCode 720 or 721, that's the unitCode */
 /* If we reached the end of the array with no "720" or "721" unitcode field, just return the right value */
 for($ientry=0;$ientry<$entries["count"];$ientry++){
 $retVal = strval($entries[$ientry][_LDAP_UNIT_CODE][0]);
 if($retVal==_UNIT_CODES_FA || $retVal==_UNIT_CODES_FEN)break;
 }
 } ldap_unbind($lc);    // Disconnect
 }
 }return $retVal;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//    ldap_convert_user_to_dspace_group
//        Args:    string    =    user in the format "name.surname".
//        This function gets the user's unitCode ldap field and returns his or her dspace "submitter" group, according to the
//        constants previously declared some lines above.
// ------------------------------------------------------------------------------------------------------------------------------
function ldap_convert_user_to_dspace_group ($user){
 if(ldap_get_user_unitCode($user)==_UNIT_CODE_FA)return _SUBMITTER_GROUP_FA;
 if(ldap_get_user_unitCode($user)==_UNIT_CODE_FEN)return _SUBMITTER_GROUP_FEN;
 return -1;    /* It is not feasible to return a valid submitter  group ... */
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//    Establish a trivial connection to the postgreSQL server where Dspace's database is stored. Returns a valid link reference
//    or -1 in case of error.
// ------------------------------------------------------------------------------------------------------------------------------
function postgreSQL_connect () {
 $connection_str_template="host=%s port=%d dbname=%s user=%s password=%s";
 $connection_str = sprintf($connection_str_template,_POSTGRESQL_HOST,_POSTGRESQL_PORT,_POSTGRESQL_DATABASE,
 _POSTGRESQL_USER, _POSTGRESQL_PWD);
 /* Okay, try to connect */
 $lref = pg_connect($connection_str);
 return $lref;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//    Close the previously established connection.
//        Args:    link    =    Connection reference to postgreSQL database
// ------------------------------------------------------------------------------------------------------------------------------
function postgreSQL_close ($lp) {
 pg_close($lp);
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//    dspace_look_for_group
//        Args:    string    =    group,    group to look for inside Dspace's database.
//            link    =    A link to the postgreSQL database.
//        Ret:    int    =    The id for this group, or -1 if it does not exist.
//
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_look_for_group ($lp, $group){
 $qry = "SELECT eperson_group_id FROM " . _DSPACE_GROUP_TABLE . " WHERE name='" . $group . "';";
 $rs = pg_query($lp, $qry);
 if($rs && pg_num_rows($rs)>0){
 /* Get just the first row : */
 $record = pg_fetch_row($rs , 0); return $record[0];
 }return -1;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_look_for_user
//		Args:	string	=	user, user to look for inside Dspace's database.
//			link	=	A link to the postgreSQL database.
//		Ret:	int	=	The id for this user, or -1 if it does not exist.
//
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_look_for_user ($lp, $user){
	$qry = "SELECT eperson_id FROM " . _DSPACE_USER_TABLE . " WHERE netid='" . $user . "';";
	$rs = pg_query($lp, $qry);
	if($rs && pg_num_rows($rs)>0){
		/* Get just the first row : */
		$record = pg_fetch_row($rs , 0); return $record[0];
	}return -1;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_look_for_community
//		Args:	string	=	community to look for
//			link	=	A link to the postgreSQL database.
//		Ret:	int	=	The id for this community, or -1 if it does not exist.
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_look_for_community ($lp , $community) {
	$qry = "SELECT community_id FROM " . _DSPACE_COMMUNITY_TABLE . " WHERE name='" . $community . "';";
	$rs = pg_query($lp, $qry);
	if($rs && pg_num_rows($rs)>0){
		/* Get just the first row : */
		$record = pg_fetch_row($rs , 0); return $record[0];
	}return -1;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_get_community_name
//		Args:	int	=	community_id to get its name
//			link	=	A link to the postgreSQL database.
//		Ret:	string	=	The community's name or NULL, if does not exit.
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_get_community_name ($lp, $community_id){
	$qry = "SELECT name FROM " . _DSPACE_COMMUNITY_TABLE . " WHERE community_id=" . $community_id . ";";
	$rs = pg_query($lp, $qry);
	if($rs && pg_num_rows($rs)>0){
		/* Get just the first row : */
		$record = pg_fetch_row($rs , 0); return $record[0];
	}return NULL;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_get_collection_name
//		Args:	int	=	collection_id to get its name
//			link	=	A link to the postgreSQL database.
//		Ret:	string	=	The collection's name or NULL, if does not exit.
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_get_collection_name ($lp, $collection_id){
	$qry = "SELECT name FROM " . _DSPACE_COLLECTION_TABLE . " WHERE collection_id=" . $collection_id . ";";
	$rs = pg_query($lp, $qry);
	if($rs && pg_num_rows($rs)>0){
		/* Get just the first row : */
		$record = pg_fetch_row($rs , 0); return $record[0];
	}return NULL;
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_gather_communities
//			link	=	Connection to the database
//		Ret array	=	The ids for all childs inside this community
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_gather_communities ($lp, $community_id){
	/* Bear in mind, the field "child_comm_id" is the "real" id for the "child node" ... not the "id" by itself eh! ;-) */
	$rs = pg_query($lp, "SELECT child_comm_id FROM Community2community WHERE parent_comm_id=" . $community_id . ";");
	if($rs && pg_num_rows($rs)>0){
		$comms = Array();
		while($record = pg_fetch_row($rs)) $comms[] = $record[0];
		return $comms;
	}return -1;	/* Error */
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_get_collections_for_a_community
//			link	=	Connection to the database
//		Ret array	=	The ids for all collections - sheeves nodes - inside this community
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_get_collections_for_a_community ($lp, $community_id){
	/* For any given community id, try to get its collections. If it has, return in the array. */
	$rs = pg_query($lp,"SELECT collection_id FROM Community2Collection WHERE community_id=" . $community_id . ";");
	if($rs && pg_num_rows($rs)>0){
		$colls = Array();
		while($record = pg_fetch_row($rs)) $colls[] = $record[0];
		return $colls;
	}return -1;	/* There's no collections inside this community */
}
 
// ------------------------------------------------------------------------------------------------------------------------------
//	dspace_recursive_communities
//		Gets all communities, subcommunities and collections from a starting point "community" id passed as an argument to
//		this function. Alternatively, it can update the "polices" and "submitters" over any collection found.
//
//		Args
//			lp				=		PostgreSQL connection
//			user			=		Username involved in this operation. In case "dspace_group" is null, the "group" will be gather
//									thanks to the "ldap" query.
//			community_id	=		The "start" point from where look for communities, subcommunities and collections recursively.
//			add_or_remove	=		1, we are adding the "group" as a submitter. 0, the other way a
round.
//			dspace_group	=		The dspace group to add or remove as a submitter. It could be null, in this case the group
//									will be determined by the function "ldap_convert_user_to_dspace_group".
//			noupdate		=		TRUE; just show the tree recursively and do not perform any updates on the database.
//
// ------------------------------------------------------------------------------------------------------------------------------
function dspace_recursive_communities ($lp, $user, $community_id,$add_or_remove,$dspace_group,$noupdate=FALSE){
 
	/* Apply the right dspace group as a submitter: */
	if($dspace_group==""){
		$fa_or_fen = ldap_convert_user_to_dspace_group($user);
		if($fa_or_fen==-1)
			/* Do nothing, to skip database inconsistencies: */
			return -1;
	}else $fa_or_fen = $dspace_group;
 
	/* First, show all the communities for this community : */
	echo ">> " . dspace_get_community_name($lp,$community_id) . ": ";
	/* Get the sub-communities : */
	$cms = dspace_gather_communities($lp, $community_id);
	if($cms!=-1){
		echo "\n";
		/* Recursively gather all subcommunities : */
		for($ic = 0; $ic < count($cms); $ic++){
			echo "\t" . dspace_get_community_name($lp, $cms[$ic]) . "\n";
			/* Has this community collections? */
			if(-1!=($cls = dspace_get_collections_for_a_community($lp, $cms[$ic]))){
				/* Show them all : */
				for($icol = 0; $icol < count($cls); $icol++){
					echo "\t\t" . dspace_get_collection_name($lp,$cls[$icol]) . "\n";
					/* Now, determine if we need to "add" or to "remove" a submitter group : */
					if($noupdate==FALSE){
						if($add_or_remove==1){
							// Add
							$upd = "UPDATE " . _DSPACE_COLLECTION_TABLE . " SET submitter=" . $fa_or_fen . "," .
								" admin=" . $fa_or_fen . " WHERE collection_id=" .
								$cls[$icol] . ";";
							$rs = pg_query($lp, $upd);
							if(!$rs || pg_affected_rows($rs)!=1)echo "ERROR updating privileges on " . $cls[$icol];
							/* Update the "ResourcePolicy" table, too : */
							$updr = "INSERT INTO ResourcePolicy (policy_id,resource_type_id, resource_id,action_id,epersongroup_id) VALUES (" .
									mt_rand() . "," .
									"3," .
									$cls[$icol] . "," .
									"3," .
									$fa_or_fen .
								");";
							$rs = pg_query($lp, $updr);
							if(!$rs || pg_affected_rows($rs)!=1)echo "ERROR updating privileges on " . $cls[$icol];
						}else{
							// The idea concerning "removing" a group is quite easy: just update the tree and set submitter=NULL.
							// At the same time, remove any "ResourcePolicy" tuple associated with this previous group.
							$upd = "UPDATE " . _DSPACE_COLLECTION_TABLE . " SET submitter=NULL," .
							       " admin=NULL WHERE collection_id=" .
							       $cls[$icol] . ";";
							       $rs = pg_query($lp, $upd);
							if(!$rs || pg_affected_rows($rs)!=1)echo "ERROR updating privileges on " . $cls[$icol];
						}
					}
				}
			}
			/* Now , get the "subcommunities" inside this community: */
			dspace_recursive_communities($lp, $user, $cms[$ic], $add_or_remove,$dspace_group);
		}
	} else echo " [ NONE ]\n";
}
 
/* ENTRY POINT */
/* Get the arguments passed on command line and store them in the array :*/
$cmd_args = getopt(_VALID_FLAGS);
 
/* If there is no args, show the "usage" and returns -1 : */
if(count($cmd_args)==0){
	echo _USAGE_MESSAGE;
	return -1;
}
 
$myref = postgreSQL_connect();
 
/* Retrieve through LDAP the user's unitCode  and the "obvious" Dspace's submitter group, if possible: */
if($cmd_args[i]){
	echo "unitCode for user: " . $cmd_args[i] . " is " . ldap_get_user_unitCode($cmd_args[i]) . ", " .
		" and Dspace submitter group should be: " . ldap_convert_user_to_dspace_group($cmd_args[i])  . "\n";
	return 0;
}
 
/* We've got a user and a "community", to add or to remove this user to|from it: */
if($cmd_args[u] && $cmd_args[b]){
 
	/* In case we are "forcing" the dspace group , be assure there are only two options: fa or fen! */
	if($cmd_args[b] != "fa" && $cmd_args[b] != "fen" ){
		echo _USAGE_MESSAGE;
		postgreSQL_close($myref);
		return -1;
	}
 
	/* Update the right Dspace group submitter */
	if($cmd_args[b]=="fa")$grp = _SUBMITTER_GROUP_FA;
	if($cmd_args[b]=="fen")$grp=_SUBMITTER_GROUP_FEN;
 
	/* Skip inconsistencies, like associating _SUBMITTER_GROUP_FA with _FEN_COMMUNITY or the other way around  : */
	if($cmd_args[a]){
		if(($grp==_SUBMITTER_GROUP_FA && strval($cmd_args[a])!=_FA_COMMUNITY) ||
			($grp==_SUBMITTER_GROUP_FEN && strval($cmd_args[a])!=_FEN_COMMUNITY)){
			echo "Error, you have to assign a FA submitter group to the FA community.\n";
			postgreSQL_close($myref);
			return -1;
		}
	}
	if($cmd_args[r]){
		if(($grp==_SUBMITTER_GROUP_FA && strval($cmd_args[r])!=_FA_COMMUNITY) ||
			($grp==_SUBMITTER_GROUP_FEN && strval($cmd_args[r])!=_FEN_COMMUNITY)){
			echo "Error, you have to assign a FA submitter group to the FA community.\n";
			postgreSQL_close($myref);
			return -1;
		}
	}
 
	/* Update the tree and the policies right now: */
	echo "Getting unitCode for user: $cmd_args[u] ...: " . ldap_get_user_unitCode($cmd_args[u]) . "\n";
	if($cmd_args[a]){	/* Add to the community */
		echo "Adding to the community: $cmd_args[b]. Recommended submitter's group: " . ldap_convert_user_to_dspace_group($cmd_args[u]) . "\n";
		// First of all, remove policies ... : */
		pg_query($myref, "DELETE FROM ResourcePolicy WHERE epersongroup_id=" . $grp . ";");
		dspace_recursive_communities ($myref , $cmd_args[u] , $cmd_args[a],1,$grp);
	}
	if($cmd_args[r]){	/* Remove from the community */
		echo "Removing from the community: $cmd_args[b] as a submitter.\n";
		// First of all, remove policies ... : */
		pg_query($myref, "DELETE FROM ResourcePolicy WHERE epersongroup_id=" . $grp . ";");
		dspace_recursive_communities ($myref , $cmd_args[u] , $cmd_args[r],0,$grp);
	}
}
 
// Disconnect :
postgreSQL_close($myref);
 
return 0;

Running the script

When it is necessary to update recursively any given CO node with all its CO child-nodes and all its CL sheaves-nodes, all we need to do is to execute the script via command prompt this way:

php -f ./dspace-update-users.php — -u ldap_username -a COn -b fa|fen

where COn will be the starting CO node in the tree to update.

It is feasible to do the other way around, that is, to “remove” a submitter group from the tree executing:

php -f ./dspace-update-users.php — -u ldap_username -r COn -b fa|fen

All CO nodes and CL nodes will be shown on the screen.