Usage Patterns to Provision for Scientific Experimentation in Clouds
Eran Chinthaka Withana and Beth Plale
School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
2nd International Conference on Cloud Computing Technology and Science, Indianapolis, IN, US

Summary
  • Doing Science in Cloud
  • Improving Scientific Job Executions in Cloud Resources
  • Role of Successful Predictions to Reduce Startup Overheads
  • System Architecture
  • Use of Reasoning
  • Evaluation
  • Discussion and Future Work

Clouds as a Complementary Solution to Grids for Science
  • Issues with existing systems
      • Batch-oriented HPC resources with long queue wait times, even under moderate loads
      • No access transparency
      • Quota system requires maximum resources to be known and approved in advance
  • Advantages of using cloud resources
      • Availability of “unlimited” compute resources the instant they are needed
      • Pay-as-you-go model eliminates up-front commitments
      • Encourages scientists to budget for the resources they are willing to pay for
  • Issues with clouds
      • Slow interconnects, virtualization overhead, and startup times
      • Consumption-based billing
  • Emergence of new programming paradigms to exploit the advantages of cloud resources

Challenges with Cloud Computing Resources
  • Scheduling algorithms
      • Focused on optimal utilization of relatively homogeneous grid or cluster resources
      • In clouds, resources can be provisioned to support user requirements
  • Prediction algorithms
      • Different hardware configurations force execution-time predictions to factor in the non-uniformity of resources

Improving Scientific Job Executions in Cloud Resources
  • Solution space
      • Meta-scheduler that uses historical information to anticipate future activity (AppLeS, GrADS)
      • Resource abstraction service (Nimrod/G)
      • Reducing the impact of startup overheads by learning from user behavioral patterns and predicting future jobs
  • Talk outline
      • Algorithm to predict future jobs by extracting user patterns from historical information
          • Reduces the impact of high startup overheads for time-critical applications
      • Use of knowledge-based techniques
          • Zero knowledge or pre-populated job information consisting of connections between jobs
          • Similar cases retrieved are used to predict future jobs, reducing high startup overheads
      • Algorithm assessment
          • Two different workloads: individual scientific jobs executed at LANL and a set of workflows executed by three users

Use Case
  • Suite of workflows can differ from domain to domain
      • WRF (Weather Research and Forecasting) as upstream node
      • Meteorologists will run pre-processing jobs to generate visualizations of parameters
      • In agriculture, scientists will use it for crop prediction
      • Wild-fire propagation and prediction
      • Generate visualizations for mobile phones using NCL scripts
      • Atmospheric scientists will use it for optimal placement of wind farms
  • User patterns reveal the sequence of jobs, taking different users/domains into consideration
  • Useful for a science gateway serving a wide range of mid-scale scientists
[Figure labels: WRF; Weather Predictions; Crop Predictions; Wind Farm Location Evaluations; Wild Fire Propagation Simulation]

Role of Successful Predictions to Reduce Startup Overheads
  • Largest gain can be achieved when prediction accuracy is high and setup time (s) is large with respect to execution time (t)
  • r = probability of successful prediction (prediction accuracy)
  • Percentage time reduction = (equation not preserved in the slide text)
  • For simplicity, assuming equal job execution and startup times: percentage time reduction = (equation not preserved in the slide text)

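The equations on this slide did not survive extraction. As a rough sketch only, the relationship between r, s, and t can be illustrated with one simple model: without prediction a job takes s + t, a correct prediction hides the whole startup time, and a wrong prediction costs the usual s + t. This model and the function name `percentage_time_reduction` are assumptions made for illustration, not the authors' formula.

```python
def percentage_time_reduction(r: float, s: float, t: float) -> float:
    """Expected percentage reduction in turnaround time (assumed model).

    Assumptions (not taken from the slides):
      - without prediction, every job takes s + t;
      - with probability r the prediction is correct and the job takes t;
      - otherwise the job takes s + t as before.
    The expected saving per job is therefore r * s out of s + t.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability in [0, 1]")
    if s < 0 or t < 0 or s + t == 0:
        raise ValueError("s and t must be non-negative and not both zero")
    return 100.0 * r * s / (s + t)
```

Under this assumed model, equal startup and execution times (s = t) give a reduction of 50·r percent, so even perfect predictions can save at most half of the turnaround time in that case.
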
Relationship of Predictions to Execution Time
  • Observations
      • Percentage time reduction increases with accuracy of predictions
      • Time reduction is reduced exponentially with increased work-to-overhead ratio
  • Need to find the critical point for a given situation
      • Fixing the required percentage time reduction for a given t/s ratio and finding the required accuracy of predictions
  • Cost of wrong predictions
      • Depends on compute resource
  • Accuracy of predictions = total successful future job predictions / total predictions
[Plot: percentage time reduction]

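The accuracy metric and the "critical point" idea can be sketched in a few lines. The accuracy function follows the definition on the slide. The second function is only an illustration: it assumes a simple model in which a correct prediction hides the full startup time s, so that the percentage reduction is 100·r / (1 + t/s). That model and both function names are assumptions, not the authors' formulation.

```python
from typing import Optional


def prediction_accuracy(successful: int, total: int) -> float:
    """Accuracy = successful future-job predictions / total predictions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


def required_accuracy(target_pct: float, t_over_s: float) -> Optional[float]:
    """Accuracy needed to reach a target percentage time reduction.

    Assumed model (not from the slides): reduction = 100 * r / (1 + t/s).
    Returns None when the target cannot be reached even with r = 1.
    """
    if t_over_s < 0:
        raise ValueError("t/s ratio must be non-negative")
    r = (target_pct / 100.0) * (1.0 + t_over_s)
    return r if r <= 1.0 else None
```

For example, under this assumed model a 25% reduction at t/s = 1 needs an accuracy of 0.5, while a 60% reduction at the same ratio is unreachable and the function returns `None`.
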
Prediction Engine: System Architecture
[Architecture diagram; surviving label: PredictionRetriever]

Use of Reasoning
  • Store and retrieve cases
  • Steps
      • Retrieval of similar cases
          • Similarity measurement
          • Use of thresholds
      • Reuse of old cases
      • Case adaptation
      • Storage

Case Similarity Calculation
  • Each case is represented using a set of attributes
      • Selected by finding the effect on the goal variable (next job)

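The retrieve, reuse, and store steps together with a threshold-based similarity check can be sketched as follows. This is a minimal illustration, not the system described in the talk: the attribute names, the weights, the 0.7 threshold, and the weighted-match similarity are all hypothetical, and case adaptation is not modeled.

```python
from collections import Counter

# Hypothetical attributes and weights; the slides do not list them.
WEIGHTS = {"user": 0.5, "application": 0.3, "queue": 0.2}


def similarity(a: dict, b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted fraction of attributes on which two cases agree (0 to 1)."""
    total = sum(weights.values())
    matched = sum(
        w for key, w in weights.items()
        if key in a and key in b and a[key] == b[key]
    )
    return matched / total


class CaseBase:
    """Stores (attributes, next_job) cases and predicts the next job."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold  # minimum similarity for a case to count
        self.cases = []

    def store(self, attributes: dict, next_job: str) -> None:
        """Storage step: remember what followed a job with these attributes."""
        self.cases.append((dict(attributes), next_job))

    def predict_next_job(self, attributes: dict):
        """Retrieval and reuse: similarity-weighted vote over similar cases."""
        votes = Counter()
        for stored, next_job in self.cases:
            score = similarity(attributes, stored)
            if score >= self.threshold:
                votes[next_job] += score
        if not votes:
            return None  # no sufficiently similar case, so no prediction
        return votes.most_common(1)[0][0]
```

Starting with an empty case base corresponds to the zero-knowledge setting mentioned earlier in the talk; calling `store` for each observed job pair gradually populates it.
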
Evaluation
  • Use cases
      • Individual job workload: 140k jobs over two years from the 1024-node CM-5 at Los Alamos National Lab
      • Workflow use case
  • Source: Parallel Workload Archive, http://www.cs.huji.ac.il/labs/parallel/workload/

Evaluation: Average Accuracy of Predictions
[Plots: Individual Jobs Workload; Workflow Workload]

Evaluation: Time Saved
  • Amount of time that can be saved if the resources are provisioned when the job is ready to run
  • Startup time
      • Assumed to be 3 minutes (average for commercial providers)
[Plots: Individual Jobs Workload; Workflow Workload]

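As a small illustration of how such a figure might be computed, the sketch below credits one full startup time to every job whose prediction was correct. It assumes wrong predictions neither save nor add time, which the slides do not state; the function name and the sample outcomes are hypothetical.

```python
def time_saved_minutes(outcomes, startup_minutes: float = 3.0) -> float:
    """Total startup time avoided, in minutes.

    outcomes: iterable of booleans, True where the next job was predicted
    correctly (so its resources were already provisioned).
    startup_minutes: assumed startup time per job (3 minutes on the slide).
    """
    correct = sum(1 for ok in outcomes if ok)
    return startup_minutes * correct
```

With three correct predictions out of four jobs, `time_saved_minutes([True, False, True, True])` credits three startup periods at the assumed 3 minutes each.
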
Evaluation: Prediction Accuracies for Use Cases

Discussion and Future Work
  • Accuracy
      • 78% for individual jobs
      • 96% for workflow workload
  • Number of jobs required to make the system stable depends on the uniqueness and distribution of unique applications
  • Amount of time that can be saved using future job prediction is inversely proportional to the t/s ratio
  • More accurate methods to prune features and identify weights
  • Evaluation of machine learning techniques as an alternative to knowledge-based systems
  • Combining future job predictions with job reliability predictions to further improve throughput of job executions

Related Work
[1] M. Armbrust et al., “Above the clouds: A Berkeley view of cloud computing,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, 2009.
[2] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[3] C. Catlett, “The philosophy of TeraGrid: building an open, extensible, distributed TeraScale facility,” in ACM International Symposium on Cluster Computing and the Grid. Published by the IEEE Computer Society, 2002.
[4] J. S. Chase, D. E. Irwin, L. E. Grit, J. D. Moore, and S. Sprenkle, “Dynamic virtual clusters in a grid site manager,” in HPDC. IEEE Computer Society, 2003, pp. 90–103.
[5] R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes, “A case for grid computing on virtual machines,” in ICDCS ’03: Proceedings of the 23rd International Conference on Distributed Computing Systems. Washington, DC, USA: IEEE Computer Society, 2003, p. 550.
[6] I. Foster, T. Freeman, K. Keahey, D. Scheftner, B. Sotomayor, and X. Zhang, “Virtual clusters for grid communities,” in CCGRID ’06: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid. Washington, DC, USA: IEEE Computer Society, 2006, pp. 513–520.
[7] K. Keahey, T. Freeman, J. Lauret, and D. Olson, “Virtual workspaces for scientific applications,” Journal of Physics: Conference Series, vol. 78, p. 012038 (5pp), 2007.
[8] B. Sotomayor, K. Keahey, and I. Foster, “Overhead matters: A model for virtual resource management,” in VTDC ’06: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing. Washington, DC, USA: IEEE Computer Society, 2006, p. 5.
…
[12] F. Berman et al., “Adaptive computing on the grid using AppLeS,” IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 4, pp. 369–382, 2003.
[13] F. Berman, A. Chien, K. Cooper, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, K. Kennedy, C. Kesselman, J. Mellor-Crummey et al., “The GrADS project: Software support for high-level grid application development,” International Journal of High Performance Computing Applications, vol. 15, no. 4, p. 327, 2001.
[14] R. Buyya, D. Abramson, and J. Giddy, “Nimrod/G: An architecture for a resource management and scheduling system in a global computational grid,” in hpc. Published by the IEEE Computer Society, 2000, p. 283.

Thank You !!
