Class InPlace
- java.lang.Object
  - edu.isi.pegasus.planner.refiner.cleanup.InPlace
- All Implemented Interfaces:
CleanupStrategy
public class InPlace extends java.lang.Object implements CleanupStrategy
This generates cleanup jobs in the workflow itself.
- Version:
- $Revision$
- Author:
- Arun Ramakrishnan, Karan Vahi
-
-
Field Summary
Fields
Modifier and Type, Field, Description
static java.lang.String
CLEANUP_JOB_PREFIX
The prefix for the cleanup job ID, i.e. the prefix plus the parent compute job ID becomes the ID of the cleanup job.
static int
DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
The default value for the number of clustered cleanup jobs created per level.
static java.lang.String
DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
The default value for the maxjobs variable for the category of cleanup jobs.
private int
mCleanupJobsPerLevel
The number of cleanup jobs per level to be created.
private int
mCleanupJobsSize
The number of cleanup jobs clustered into a clustered cleanup job.
private java.util.HashSet
mDoNotClean
HashSet of files that should not be cleaned up.
private CleanupImplementation
mImpl
The handle to the CleanupImplementation instance that creates the jobs for us.
private LogManager
mLogger
The handle to the logging object used for logging.
private int
mMaxDepth
The max depth of any job in the workflow; useful for a priority queue implementation in an array.
private PegasusProperties
mProps
The handle to the properties passed to Pegasus.
private java.util.HashMap
mResMap
Maps a siteHandle (String) to the Set of all jobs mapped to that site.
private java.util.HashMap
mResMapLeaves
Maps a siteHandle (String) to the Set of jobs mapped to that site that are leaves in the workflow.
private java.util.HashMap
mResMapRoots
Maps a siteHandle (String) to the Set of jobs mapped to that site that are roots in the workflow.
private boolean
mUseSizeFactor
A boolean indicating whether we prefer to use the size factor or the num factor.
-
Fields inherited from interface edu.isi.pegasus.planner.refiner.cleanup.CleanupStrategy
VERSION
-
-
Constructor Summary
Constructors
Constructor, Description
InPlace()
The default constructor.
-
Method Summary
Modifier and Type, Method, Description
Graph
addCleanupJobs(Graph workflow)
Adds cleanup jobs to the workflow.
private void
addCleanUpJobs(java.lang.String site, java.util.Set leaves, Graph workflow)
Adds cleanup jobs for the part of the workflow scheduled to a particular site. A breadth-first search strategy is implemented based on the depth of the job in the workflow.
protected void
applyJobPriorities(Graph workflow)
Adds job priorities to the jobs in the workflow on the basis of the levels in the traversal order given by the iterator.
private java.util.List<GraphNode>
clusterCleanupGraphNodes(java.util.List<GraphNode> cleanupNodes, java.util.HashMap cleanedBy, java.lang.String site, int level)
Takes a list of cleanup nodes, one per job (compute or stageout) whose files need to be deleted, and clusters them into a smaller set of cleanup nodes.
private GraphNode
createClusteredCleanupGraphNode(java.util.List<GraphNode> nodes, java.util.HashMap cleanedBy, java.lang.String site, int level, int index)
Creates a clustered cleanup graph node that aggregates multiple cleanup nodes into one node.
protected java.lang.String
generateCleanupID(Job job)
Returns the identifier that is to be assigned to the cleanup job.
java.lang.String
generateClusteredJobID(java.lang.String site, int level, int index)
Generates an ID for a clustered cleanup job.
private int
getClusterSize(int size)
Returns the number of cleanup jobs clustered into one job per level.
java.lang.String
getDefaultCleanupMaxJobsPropertyKey()
Returns the property key that can be used to set the max jobs for the default category associated with the cleanup jobs.
protected java.lang.String
getSiteForCleanup(Job job)
Returns the site to be used for the cleanup algorithm.
void
initialize(PegasusBag bag, CleanupImplementation impl)
Initializes the class.
protected void
reduceDependency(GraphNode node)
Reduces the number of edges between the node and its parents.
protected void
reset()
Resets the internal data structures.
private void
setDepth_ResMap(java.util.List roots)
A BFS implementation that sets the depth value (roots have depth 1) and populates mResMap, mResMapLeaves, and mResMapRoots, which contain the jobs that are assigned to a particular resource.
protected boolean
typeNeedsCleanUp(int type)
Checks to see which job types are required to be looked at for cleanup.
protected boolean
typeNeedsCleanUp(GraphNode node)
Checks to see which job types are required to be looked at for cleanup.
protected boolean
typeStageOut(int type)
Checks to see if job type is a stageout job type.
-
-
-
Field Detail
-
CLEANUP_JOB_PREFIX
public static final java.lang.String CLEANUP_JOB_PREFIX
The prefix for the cleanup job ID, i.e. the prefix plus the parent compute job ID becomes the ID of the cleanup job.
- See Also:
- Constant Field Values
-
DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
public static final java.lang.String DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
The default value for the maxjobs variable for the category of cleanup jobs.
- See Also:
- Constant Field Values
-
DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
public static final int DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
The default value for the number of clustered cleanup jobs created per level.
- See Also:
- Constant Field Values
-
mResMap
private java.util.HashMap mResMap
Maps a siteHandle (String) to the Set of all jobs mapped to that site.
-
mResMapLeaves
private java.util.HashMap mResMapLeaves
Maps a siteHandle (String) to the Set of jobs mapped to that site that are leaves in the workflow.
-
mResMapRoots
private java.util.HashMap mResMapRoots
Maps a siteHandle (String) to the Set of jobs mapped to that site that are roots in the workflow.
-
mMaxDepth
private int mMaxDepth
The max depth of any job in the workflow; useful for a priority queue implementation in an array.
-
mDoNotClean
private java.util.HashSet mDoNotClean
HashSet of Files that should not be cleaned up
-
mImpl
private CleanupImplementation mImpl
The handle to the CleanupImplementation instance that creates the jobs for us.
-
mProps
private PegasusProperties mProps
The handle to the properties passed to Pegasus.
-
mLogger
private LogManager mLogger
The handle to the logging object used for logging.
-
mCleanupJobsPerLevel
private int mCleanupJobsPerLevel
The number of cleanup jobs per level to be created
-
mCleanupJobsSize
private int mCleanupJobsSize
The number of cleanup jobs clustered into a clustered cleanup job.
-
mUseSizeFactor
private boolean mUseSizeFactor
A boolean indicating whether we prefer to use the size factor or the num factor.
-
-
Method Detail
-
initialize
public void initialize(PegasusBag bag, CleanupImplementation impl)
Initializes the class.
- Specified by:
initialize
in interface CleanupStrategy
- Parameters:
bag
- bag of initialization objects
impl
- the implementation instance that creates cleanup jobs
-
addCleanupJobs
public Graph addCleanupJobs(Graph workflow)
Adds cleanup jobs to the workflow.
- Specified by:
addCleanupJobs
in interface CleanupStrategy
- Parameters:
workflow
- the workflow to add cleanup jobs to.
- Returns:
- the workflow with cleanup jobs added to it.
-
reset
protected void reset()
Resets the internal data structures.
-
setDepth_ResMap
private void setDepth_ResMap(java.util.List roots)
A BFS implementation that sets the depth value (roots have depth 1) and populates mResMap, mResMapLeaves, and mResMapRoots, which contain the jobs that are assigned to a particular resource.
- Parameters:
roots
- List of GraphNode objects that are roots
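The depth assignment and per-site bookkeeping described for setDepth_ResMap can be sketched as follows. This is a minimal illustration under stated assumptions, not the Pegasus implementation: `DepthSketch`, its `Node` class, and the `site` field are hypothetical stand-ins for GraphNode and the private maps. The sketch also assumes that a node's depth is one more than the deepest of its parents, and that every parentless node is passed in as a root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: DepthSketch and Node are hypothetical, not Pegasus classes. */
final class DepthSketch {
    static final class Node {
        final String id;
        final String site;
        final List<Node> parents = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
        int depth = -1;        // -1 marks a node that has not been reached yet
        int unvisitedParents;  // parents that have not been processed yet

        Node(String id, String site) {
            this.id = id;
            this.site = site;
        }

        void addChild(Node child) {
            children.add(child);
            child.parents.add(this);
        }
    }

    // site handle -> all jobs, leaf jobs, and root jobs mapped to that site
    final Map<String, Set<Node>> resMap = new HashMap<>();
    final Map<String, Set<Node>> resMapLeaves = new HashMap<>();
    final Map<String, Set<Node>> resMapRoots = new HashMap<>();
    int maxDepth = 0;

    /** Breadth-first pass from the roots; roots get depth 1. */
    void setDepths(List<Node> roots) {
        Deque<Node> queue = new ArrayDeque<>();
        for (Node root : roots) {
            root.depth = 1;
            queue.add(root);
        }
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            maxDepth = Math.max(maxDepth, current.depth);
            record(resMap, current);
            if (current.parents.isEmpty()) record(resMapRoots, current);
            if (current.children.isEmpty()) record(resMapLeaves, current);
            for (Node child : current.children) {
                if (child.depth == -1) child.unvisitedParents = child.parents.size();
                child.depth = Math.max(child.depth, current.depth + 1);
                // A child is queued only once all of its parents are done,
                // so its depth is final by the time it is processed.
                if (--child.unvisitedParents == 0) queue.add(child);
            }
        }
    }

    private static void record(Map<String, Set<Node>> map, Node node) {
        map.computeIfAbsent(node.site, s -> new HashSet<>()).add(node);
    }

    public static void main(String[] args) {
        Node a = new Node("A", "s1");
        Node b = new Node("B", "s1");
        a.addChild(b);
        DepthSketch sketch = new DepthSketch();
        sketch.setDepths(Arrays.asList(a));
        System.out.println("max depth = " + sketch.maxDepth);
    }
}
```

Recording the maximum depth during the same pass is what later allows an array indexed by depth to serve as a priority queue.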
-
addCleanUpJobs
private void addCleanUpJobs(java.lang.String site, java.util.Set leaves, Graph workflow)
Adds cleanup jobs for the part of the workflow scheduled to a particular site. A breadth-first search strategy is implemented based on the depth of the job in the workflow.
- Parameters:
site
- the site ID
leaves
- the leaf jobs that are scheduled to site
workflow
- the Graph into which new cleanup jobs can be added
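A depth-ordered traversal of this kind can be driven by an array of queues indexed by depth, which is the "priority queue in an array" idea mentioned for mMaxDepth. The sketch below is an assumption-laden illustration rather than the actual method: `DepthBucketSketch` and `Node` are hypothetical, depths are taken as already computed, and the traversal is assumed to start at the leaves and visit the deepest jobs first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: DepthBucketSketch and Node are hypothetical names. */
final class DepthBucketSketch {
    static final class Node {
        final String id;
        final int depth; // assumed precomputed, roots have depth 1
        final List<Node> parents = new ArrayList<>();

        Node(String id, int depth) {
            this.id = id;
            this.depth = depth;
        }
    }

    /** Visits the leaves and all their ancestors, deepest level first. */
    static List<String> visitDeepestFirst(Collection<Node> leaves, int maxDepth) {
        // buckets.get(i) holds the nodes at depth (maxDepth - i)
        List<Deque<Node>> buckets = new ArrayList<>();
        for (int i = 0; i < maxDepth; i++) {
            buckets.add(new ArrayDeque<>());
        }
        Set<Node> queued = new HashSet<>();
        for (Node leaf : leaves) {
            if (queued.add(leaf)) buckets.get(maxDepth - leaf.depth).add(leaf);
        }
        List<String> order = new ArrayList<>();
        for (int i = 0; i < maxDepth; i++) {
            Deque<Node> bucket = buckets.get(i);
            while (!bucket.isEmpty()) {
                Node current = bucket.remove();
                order.add(current.id);
                // Parents are shallower, so they always land in a later bucket.
                for (Node parent : current.parents) {
                    if (queued.add(parent)) buckets.get(maxDepth - parent.depth).add(parent);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Node a = new Node("A", 1);
        Node b = new Node("B", 2);
        Node c = new Node("C", 2);
        Node d = new Node("D", 3);
        d.parents.add(b);
        d.parents.add(c);
        b.parents.add(a);
        c.parents.add(a);
        List<Node> leaves = new ArrayList<>();
        leaves.add(d);
        System.out.println(visitDeepestFirst(leaves, 3)); // [D, B, C, A]
    }
}
```

Because depths are bounded by the maximum depth, inserting a job into its bucket is a constant-time operation, which a general-purpose heap would not give.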
-
reduceDependency
protected void reduceDependency(GraphNode node)
Reduces the number of edges between the node and its parents. For the node X, look at the parents of X. For each parent Y, see if there is a path to any other parent Z of X. If a path exists, then the edge from Z to X can be removed, since X remains reachable from Z through Y.
- Parameters:
node
- the node whose parent edges need to be reduced.
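The edge reduction described above amounts to dropping any parent that is also an ancestor of another parent. A minimal sketch, assuming a hypothetical `Node` class with parent and child sets (this is not the GraphNode API): walk upward from the node's parents, and mark every ancestor that is itself a direct parent as redundant.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative only: ReduceDependencySketch and Node are hypothetical names. */
final class ReduceDependencySketch {
    static final class Node {
        final String id;
        final Set<Node> parents = new LinkedHashSet<>();
        final Set<Node> children = new LinkedHashSet<>();

        Node(String id) {
            this.id = id;
        }

        void addChild(Node child) {
            children.add(child);
            child.parents.add(this);
        }
    }

    /** Removes edges Z -> node where Z still reaches node through another parent. */
    static void reduceDependency(Node node) {
        Set<Node> redundant = new HashSet<>();
        Set<Node> visited = new HashSet<>();
        Deque<Node> queue = new ArrayDeque<>(node.parents);
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            if (!visited.add(current)) continue; // already expanded
            for (Node ancestor : current.parents) {
                // An ancestor of some parent that is also a direct parent
                // contributes an edge that adds no ordering constraint.
                if (node.parents.contains(ancestor)) redundant.add(ancestor);
                queue.add(ancestor);
            }
        }
        for (Node z : redundant) {
            node.parents.remove(z);
            z.children.remove(node);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        Node x = new Node("X");
        a.addChild(b);
        b.addChild(x);
        a.addChild(x); // redundant: A already precedes X through B
        reduceDependency(x);
        System.out.println(x.parents.size()); // 1
    }
}
```

The sketch relies on the workflow being acyclic; in a graph with cycles a parent could appear as its own ancestor and be removed incorrectly.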
-
applyJobPriorities
protected void applyJobPriorities(Graph workflow)
Adds job priorities to the jobs in the workflow on the basis of the levels in the traversal order given by the iterator. Later on this would be a separate refiner.
- Parameters:
workflow
- the workflow on which to apply job priorities.
-
generateCleanupID
protected java.lang.String generateCleanupID(Job job)
Returns the identifier that is to be assigned to the cleanup job.
- Parameters:
job
- the job with which the cleanup job is primarily associated.
- Returns:
- the identifier for a cleanup job.
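The naming rule stated for CLEANUP_JOB_PREFIX (prefix plus the ID of the parent compute job) can be illustrated with a one-line helper. This is only a sketch: the prefix value and the sample job ID below are hypothetical placeholders, and the helper takes a plain String rather than a Job.

```java
/** Illustrative only: the prefix value below is a placeholder, not the real constant. */
final class CleanupIdSketch {
    // Hypothetical value chosen for the example.
    static final String CLEANUP_JOB_PREFIX = "clean_up_";

    /** Builds the cleanup job ID from the ID of the job it cleans up after. */
    static String generateCleanupID(String parentJobId) {
        return CLEANUP_JOB_PREFIX + parentJobId;
    }

    public static void main(String[] args) {
        System.out.println(generateCleanupID("analyze_ID0000001")); // clean_up_analyze_ID0000001
    }
}
```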
-
generateClusteredJobID
public java.lang.String generateClusteredJobID(java.lang.String site, int level, int index)
Generates an ID for a clustered cleanup job.
- Parameters:
site
- the site associated with the cleanup jobs
level
- the level of the workflow
index
- the index of the job on that level
- Returns:
-
typeNeedsCleanUp
protected boolean typeNeedsCleanUp(GraphNode node)
Checks to see which job types are required to be looked at for cleanup. COMPUTE_JOB, STAGE_OUT_JOB, and INTER_POOL_JOB are the ones that need cleanup.
- Parameters:
node
- the graph node
- Returns:
- boolean
-
typeNeedsCleanUp
protected boolean typeNeedsCleanUp(int type)
Checks to see which job types are required to be looked at for cleanup. COMPUTE_JOB, STAGE_OUT_JOB, and INTER_POOL_JOB are the ones that need cleanup.
- Parameters:
type
- the type of the job.
- Returns:
- boolean
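The type check described for typeNeedsCleanUp reduces to a membership test over three job types. In the sketch below the constant names follow the description, but their integer values, the extra STAGE_IN_JOB constant, and the `JobTypeSketch` class are assumptions made for the example.

```java
/** Illustrative only: the integer values of the job types are placeholders. */
final class JobTypeSketch {
    static final int COMPUTE_JOB = 1;
    static final int STAGE_IN_JOB = 2;   // included only as a negative example
    static final int STAGE_OUT_JOB = 3;
    static final int INTER_POOL_JOB = 5;

    /** True for the job types whose files are candidates for cleanup. */
    static boolean typeNeedsCleanUp(int type) {
        return type == COMPUTE_JOB
                || type == STAGE_OUT_JOB
                || type == INTER_POOL_JOB;
    }

    public static void main(String[] args) {
        System.out.println(typeNeedsCleanUp(COMPUTE_JOB));  // true
        System.out.println(typeNeedsCleanUp(STAGE_IN_JOB)); // false
    }
}
```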
-
typeStageOut
protected boolean typeStageOut(int type)
Checks to see if job type is a stageout job type.
- Parameters:
type
- the type of the job.
- Returns:
- boolean
-
getSiteForCleanup
protected java.lang.String getSiteForCleanup(Job job)
Returns the site to be used for the cleanup algorithm. For compute jobs the staging site is used, while for stageout jobs is used. For all other jobs the execution site is used.
- Parameters:
job
- the job
- Returns:
- the site to be used
-
getDefaultCleanupMaxJobsPropertyKey
public java.lang.String getDefaultCleanupMaxJobsPropertyKey()
Returns the property key that can be used to set the max jobs for the default category associated with the cleanup jobs.
- Returns:
- the property key
-
clusterCleanupGraphNodes
private java.util.List<GraphNode> clusterCleanupGraphNodes(java.util.List<GraphNode> cleanupNodes, java.util.HashMap cleanedBy, java.lang.String site, int level)
Takes a list of cleanup nodes, one per job (compute or stageout) whose files need to be deleted, and clusters them into a smaller set of cleanup nodes.
- Parameters:
cleanupNodes
- list of stub cleanup nodes, each created for a job in the workflow that needs cleanup. The cleanup nodes have a CleanupJobContent as their content.
cleanedBy
- a map that tracks which file was deleted by which cleanup job
site
- the site associated with the cleanup jobs
level
- the level of the workflow
- Returns:
- a set of clustered cleanup nodes
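Clustering a level's cleanup nodes into a smaller set is, at its core, a partitioning step. The following sketch shows one way to split a list into fixed-size groups; `ClusterPartitionSketch.partition` is a hypothetical helper, and consecutive fixed-size grouping is an assumption about the strategy rather than something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: ClusterPartitionSketch is a hypothetical helper. */
final class ClusterPartitionSketch {
    /** Splits items into consecutive groups of at most clusterSize elements. */
    static <T> List<List<T>> partition(List<T> items, int clusterSize) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be at least 1");
        }
        List<List<T>> clusters = new ArrayList<>();
        for (int start = 0; start < items.size(); start += clusterSize) {
            int end = Math.min(start + clusterSize, items.size());
            // Copy the view so each cluster is independent of the input list.
            clusters.add(new ArrayList<>(items.subList(start, end)));
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<String> stubs = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        System.out.println(partition(stubs, 2)); // [[c1, c2], [c3, c4], [c5]]
    }
}
```

Each resulting group would then be handed to a routine that merges its members into one clustered cleanup node.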
-
createClusteredCleanupGraphNode
private GraphNode createClusteredCleanupGraphNode(java.util.List<GraphNode> nodes, java.util.HashMap cleanedBy, java.lang.String site, int level, int index)
Creates a clustered cleanup graph node that aggregates multiple cleanup nodes into one node.
- Parameters:
nodes
- list of cleanup nodes that are to be aggregated
cleanedBy
- a map that tracks which file was deleted by which cleanup job
site
- the site associated with the cleanup jobs
level
- the level of the workflow
index
- the index of the cleanup job for that level
- Returns:
- a clustered cleanup node with the appropriate linkages added to the workflow, or null if the clustered cleanup node has no files to delete
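The aggregation step can be pictured as merging the file lists of several cleanup nodes into one. The sketch below makes several assumptions that this page does not confirm: `CleanupNode` is a hypothetical stand-in holding only file names, duplicate files are deleted once, and the cleanedBy map is re-pointed at the clustered job. It does mirror the documented contract of returning null when there is nothing to delete.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: ClusteredNodeSketch and CleanupNode are hypothetical names. */
final class ClusteredNodeSketch {
    static final class CleanupNode {
        final String id;
        final Set<String> files;

        CleanupNode(String id, Collection<String> files) {
            this.id = id;
            this.files = new LinkedHashSet<>(files);
        }
    }

    /** Merges nodes into one clustered node, or returns null if no files remain. */
    static CleanupNode merge(String clusteredId, List<CleanupNode> nodes,
                             Map<String, String> cleanedBy) {
        Set<String> files = new LinkedHashSet<>();
        for (CleanupNode node : nodes) {
            files.addAll(node.files); // the set drops duplicate file names
        }
        if (files.isEmpty()) {
            return null;
        }
        for (String file : files) {
            cleanedBy.put(file, clusteredId); // assumed: file -> deleting job ID
        }
        return new CleanupNode(clusteredId, files);
    }

    public static void main(String[] args) {
        Map<String, String> cleanedBy = new HashMap<>();
        CleanupNode merged = merge("clustered_0", Arrays.asList(
                new CleanupNode("c1", Arrays.asList("f.a", "f.b")),
                new CleanupNode("c2", Arrays.asList("f.b", "f.c"))), cleanedBy);
        System.out.println(merged.files); // [f.a, f.b, f.c]
    }
}
```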
-
getClusterSize
private int getClusterSize(int size)
Returns the number of cleanup jobs clustered into one job per level.
- Parameters:
size
- the number of cleanup jobs created by the algorithm before clustering for the level.
- Returns:
- the number of clustered cleanup jobs to be created for the level
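The fields mUseSizeFactor, mCleanupJobsSize, and mCleanupJobsPerLevel suggest two ways of choosing a cluster size. The sketch below is one plausible reading, offered as an assumption rather than the documented behavior: with the size factor the configured cluster size is used directly, and otherwise the level's jobs are spread over a fixed number of clustered jobs using ceiling division. The fields are passed as parameters to keep the example self-contained.

```java
/** Illustrative only: a guess at the rule, with fields passed in as parameters. */
final class ClusterSizeSketch {
    /**
     * @param size                number of cleanup jobs on the level before clustering
     * @param useSizeFactor       whether a fixed cluster size is preferred
     * @param cleanupJobsSize     cleanup jobs per clustered job (size factor)
     * @param cleanupJobsPerLevel clustered jobs wanted per level (num factor)
     * @return how many cleanup jobs go into one clustered job
     */
    static int getClusterSize(int size, boolean useSizeFactor,
                              int cleanupJobsSize, int cleanupJobsPerLevel) {
        if (useSizeFactor) {
            return cleanupJobsSize;
        }
        // Integer ceiling of size / cleanupJobsPerLevel.
        return (size + cleanupJobsPerLevel - 1) / cleanupJobsPerLevel;
    }

    public static void main(String[] args) {
        System.out.println(getClusterSize(10, false, 0, 4)); // 3
        System.out.println(getClusterSize(10, true, 5, 4));  // 5
    }
}
```

With ceiling division, 10 cleanup jobs spread over 4 clustered jobs gives a cluster size of 3, so the level ends up with clusters of 3, 3, 3, and 1 jobs.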
-
-