Performance Methods for Distributed and Cloud Software

Funding Details
Natural Sciences and Engineering Research Council of Canada
  • Grant type: Discovery Grants Program - Individual
  • Years: 2016/17 to 2019/20
  • Total Funding: $104,000
Keywords
Principal Investigator(s)
Collaborator(s)

No researchers found.

Partners

No partner organizations found.

Project Summary

Performance models of distributed software systems can be used in many ways: (1) to guide software design or refactoring, (2) to plan the system deployment, and (3) to optimize resource use. Layered queueing network models (LQNs) are ideal for this analysis since they attach resource usage to objects directly corresponding to software artifacts, and because they describe contention for logical as well as physical resources. All three uses require models that are quickly and easily created and calibrated, so that they can track product development and changes in the deployment environment or usage levels. For example, DevOps (development with continuous releases) requires continuous re-modeling, which is provided by the proposer's existing methods to generate models automatically from software designs (students Dorin Petriu, Nariman Mani and others). Features of the deployment environment are included as “performance completions” (work with student Adnan Faisal and Prof. Dorina Petriu). To adapt a model, the proposer has applied nonlinear regression techniques to quickly calibrate LQN models, and statistical tracking filters to track dynamic changes in program behaviour. To exploit the models for cloud management, he has applied linear and integer programming to optimize deployment. The proposed research will be based on these capabilities.

The first broad area, and the majority of the proposal, addresses the creation of models and the fundamental unsolved problem of choosing the model structure. System knowledge often provides a complex model containing elements that are not essential for performance evaluation; such a model is difficult to understand and to calibrate, and has long solution times. Current research (student Farhana Islam) is developing procedures to simplify a complex model. The proposed research addresses how the simplification of a given system changes depending on system usage and user goals, using a novel concept of the utility of a model for solving a given performance problem. Adapting the simplified model as the system changes will modify its parameters and also revise the simplified structure, extending previous work in parameter tracking to "structure tracking".

The second broad area considers the use of the models. The study of model utility will characterize performance problems in a general way. A significant emerging problem is optimizing the deployment of large systems over multiple service centers or multi-clouds, with substantial inter-center network latencies. Our solutions for a single cloud (student Jim Li) do not generalize to account for this latency, but heuristics developed by student Ravneet Kaur solve the problem for simple cases. They involve graph partitioning with coarsening and refining, combined with bin-packing. Practical problems will require extending these heuristics.
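
As an illustration only, the combination of graph coarsening and bin-packing mentioned above can be sketched in a few lines of Python. This is not the heuristic developed in the project: the function names, the single scalar demand per task, the uniform center capacity, and the greedy merge rule are all assumptions made for the example, and the refining step is omitted. The sketch also assumes that every task fits within one center on its own.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def coarsen(demand: Dict[str, float],
            traffic: Dict[Tuple[str, str], float],
            capacity: float) -> Dict[FrozenSet[str], float]:
    """Repeatedly merge the two clusters with the heaviest traffic between
    them, provided the merged demand still fits in one center (assumed rule)."""
    clusters = {frozenset([task]): d for task, d in demand.items()}
    while True:
        best, best_weight = None, 0.0
        keys = list(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                if clusters[a] + clusters[b] > capacity:
                    continue  # merged cluster would not fit in one center
                weight = sum(v for (x, y), v in traffic.items()
                             if (x in a and y in b) or (x in b and y in a))
                if weight > best_weight:
                    best, best_weight = (a, b), weight
        if best is None:
            return clusters  # no communicating pair can be merged any more
        a, b = best
        clusters[a | b] = clusters.pop(a) + clusters.pop(b)


def first_fit_decreasing(clusters: Dict[FrozenSet[str], float],
                         capacity: float) -> List[Set[str]]:
    """Pack clusters into centers, largest demand first (bin-packing step)."""
    centers: List[list] = []  # each entry is [used_capacity, set_of_tasks]
    for group, d in sorted(clusters.items(), key=lambda kv: -kv[1]):
        for center in centers:
            if center[0] + d <= capacity:
                center[0] += d
                center[1] |= set(group)
                break
        else:
            centers.append([d, set(group)])  # open a new center
    return [center[1] for center in centers]


# Hypothetical four-task system: demands and traffic are invented numbers.
demand = {"web": 2, "app": 3, "db": 4, "cache": 1}
traffic = {("web", "app"): 10, ("app", "db"): 8, ("db", "cache"): 5}
placement = first_fit_decreasing(coarsen(demand, traffic, capacity=5), capacity=5)
```

Here the heavily communicating pairs are co-located first, so only the lighter "app" to "db" link crosses centers. A refining pass would then try moving individual tasks between centers to reduce the cross-center traffic further.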