Progress Report 9/9. CHT Project Develop a resource management scheduling algorithm for CHT...
Progress Report
9/9
CHT Project
Develop a resource management scheduling algorithm for the CHT datacenter.
◦ Two types of jobs: interactive/latency-sensitive and batch/computation-intensive.
◦ Meet the SLA of jobs while maximizing server utilization.
Minimize SLA violations with limited resources (# of servers).
Target Platform
CHT datacenter
◦ A number of heterogeneous servers
Red Hat OpenShift
Implementation
Monitor, controller/decision maker
◦ Stand-alone modules
Scheduler
◦ OpenShift plug-in
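Option 1
◦ Monitor: use OpenShift's built-in mechanism and read the data out via the JVM console or similar means.
◦ Scheduler: implement this project's algorithm.
◦ Controller: adjust system settings through the OS API.
◦ Requires modifying the OpenShift source code: No.
◦ Advantages: 1. Simple to implement; only the scheduler has to be implemented. 2. OpenShift's original scheduler does not need to be changed.
◦ Disadvantages: 1. Not yet certain whether the Service Layer and Routing Layer have suitable APIs available (CHT).
Option 2
◦ Monitor: third-party monitoring tool.
◦ Scheduler: implement this project's algorithm.
◦ Controller: adjust system settings through the OS API.
◦ Requires modifying the OpenShift source code: No.
◦ Advantages: 1. OpenShift's original scheduler does not need to be changed. 2. A different monitoring tool can be chosen freely.
◦ Disadvantages: 1. Not yet certain whether the Service Layer and Routing Layer have suitable APIs available (CHT).
Option 3
◦ Monitor: developed in-house (AP/OS level).
◦ Scheduler: interface with OpenShift as a plug-in.
◦ Controller: use OpenShift's built-in mechanism.
◦ Requires modifying the OpenShift source code: No (the original scheduler is replaced).
◦ Advantages: 1. This is OpenShift's built-in way of interfacing a scheduler (CHT). 2. The interfacing is not complicated (CHT).
◦ Disadvantages: 1. The plug-in must be developed in Go (CHT).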
Idea
One interactive task per server.
Spare resource monitoring.
◦ Monitor the status of interactive tasks on a server.
◦ Deploy batch tasks to servers with spare resources.
◦ Separate the tasks from the same job in order to avoid resource contention.
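The placement idea above can be sketched in Python as a simple filter. This is only an illustration under stated assumptions, not the project's implementation: the `Server` fields, the single abstract resource unit, and the function name `batch_candidates` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    # All fields are hypothetical; resources use one abstract unit.
    name: str
    capacity: float
    interactive_usage: float = 0.0  # monitored usage of the interactive task
    batch_usage: float = 0.0        # resources already given to batch tasks
    batch_jobs: set = field(default_factory=set)  # job ids of hosted batch tasks

    def spare(self) -> float:
        """Resources left after the interactive task and existing batch tasks."""
        return self.capacity - self.interactive_usage - self.batch_usage


def batch_candidates(servers, job_id, demand):
    """Servers with enough spare resources that host no task of the same job."""
    return [
        s for s in servers
        if s.spare() >= demand and job_id not in s.batch_jobs
    ]
```

Excluding servers that already host a task of the same job is one way to read "separate the tasks from the same job"; a softer variant would penalize such servers in a score instead of excluding them.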
Pseudo Code
findServer(newContainer):
    candidate <- empty list
    foreach server in serverList:
        if server.canHost(newContainer):
            add server to candidate
    foreach server in candidate:
        compute server.Score(newContainer)
    target = maxScore(candidate)
    return target    ## Deploy newContainer to target server
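A minimal runnable sketch of the same filter-then-score procedure in Python. The feasibility check and the score are passed in as functions because the slides do not define them; returning `None` when no server can host the container is an assumption, since the pseudocode does not cover that case.

```python
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")  # server type (left abstract)
C = TypeVar("C")  # container type (left abstract)


def find_server(
    new_container: C,
    server_list: Iterable[S],
    can_host: Callable[[S, C], bool],
    score: Callable[[S, C], float],
) -> Optional[S]:
    # Step 1: keep only the servers that are able to host the container.
    candidates = [s for s in server_list if can_host(s, new_container)]
    if not candidates:
        return None  # assumption: the slides do not say what happens here
    # Step 2: pick the candidate with the highest score
    # (on ties, max() keeps the first one encountered).
    return max(candidates, key=lambda s: score(s, new_container))
```

For example, with servers represented as dictionaries holding a hypothetical `free` field, `can_host` could compare `free` against the container size and `score` could return the resources left after placement.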
Some Issues
Measuring the load of interactive jobs
◦ CHT will try to provide agents in the containers that return the (critical) resource usage of the APs.
Size of container
◦ Fixed for the same job.
◦ Varies among different jobs.
Some Issues (Cont.)
Candidate while choosing server
◦ May encounter restrictions such as the number of licenses.
Score function while choosing server
◦ Assign tasks/containers to the server with the highest score.
◦ Score = w1*activeServer + w2*spareResource + w3*(-#otherInJobContainer) + w4*…
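The weighted sum above could be written as the following Python sketch. The weight values are placeholders, treating `activeServer` as a 0/1 indicator is an assumption, and the elided `w4` term is left out because the slides do not specify it.

```python
def score(
    active_server: bool,
    spare_resource: float,
    same_job_containers: int,
    w1: float = 1.0,  # placeholder weights, not values from the project
    w2: float = 1.0,
    w3: float = 1.0,
) -> float:
    """Score = w1*activeServer + w2*spareResource + w3*(-#otherInJobContainer)."""
    return (
        w1 * (1.0 if active_server else 0.0)
        + w2 * spare_resource
        - w3 * same_job_containers
    )
```

With this form, a server that is already active and has more spare resources scores higher, while each container of the same job already on the server lowers the score.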
Some Issues (Cont.)
Hardware ability (by Prof. Lin)
◦ Using the load may be problematic. Example:
Assume an α CPU with computing ability S and a β CPU with computing ability 2S. The host OS consumes 0.25S, and the container needs S to meet QoS. If we deploy the same container to α and β, the load (from the container's perspective) may be 75% vs. 50%, yet the container on α violates the QoS because only 0.75S is left for it (insufficient resources).
◦ Not sure, will verify.
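The arithmetic of the example can be checked with a short Python sketch. It encodes one possible reading of the slide, which is an assumption: the container receives whatever is left after the host OS, capped at its demand, and the load is what it receives divided by the CPU's total ability. S is normalized to 1.0.

```python
def container_view(capacity: float, host_overhead: float, demand: float):
    """Return (load seen for the container, whether its QoS demand is met)."""
    available = capacity - host_overhead  # what the host OS leaves over
    received = min(demand, available)     # the container cannot get more than that
    load = received / capacity            # fraction of the CPU used by the container
    return load, received >= demand


alpha = container_view(capacity=1.0, host_overhead=0.25, demand=1.0)
beta = container_view(capacity=2.0, host_overhead=0.25, demand=1.0)
```

Under this reading, a load of 75% on α looks like headroom even though the container is starved, which is why the raw load may be a misleading input to the scheduler on heterogeneous servers.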
Next
Keep working