The Design and Architecture of the Microsoft Cluster Service (selected courseware, .ppt)
The Design and Architecture of the Microsoft Cluster Service (MSCS)
  W. Vogels et al.
  ECE 845 Presentation
  By Sandeep Tamboli
  April 18, 2000

Outline
  Prerequisites
  Introduction
  Design Goals
  Cluster Abstractions
  Cluster Operation
  Cluster Architecture
  Implementation Examples
  Summary

Prerequisites
  Availability = MTTF / (MTTF + MTTR)
    MTTF: Mean Time To Failure
    MTTR: Mean Time To Repair
  High Availability:
    Modern taxonomy of High Availability: A system having sufficient redundancy in components to mask certain defined faults has High Availability (HA).
    IBM High Availability Services: The goals of high availability solutions are to minimize both the number of service interruptions and the time needed to recover when an outage does occur. High availability is not a specific technology nor a quantifiable attribute; it is a goal to be reached. This goal is different for each system and is based on the specific needs of the business the system supports.
    The presenter: May have degraded performance while a component is down

MSCS (a.k.a. Wolfpack)
  Extension of Windows NT to improve availability
  First phase of implementation
    Scalability limited up to 2 nodes
  MSCS features:
    Fail over
    Migration
    Automated restart
  Differences with previous HA solutions:
    Simpler User Interface
    More sophisticated modeling of applications
    Tighter integration with the OS (NT)

MSCS (2)
  Shared nothing cluster model:
    Each node owns a subset of cluster resources
    Only one node may own a resource at a time
    On failure, another node may take the resource ownership

Design Goals
  Commodity
    Commercial-off-the-shelf nodes
    Windows NT server
    Standard Internet protocols
  Scalability
  Transparency
    Presented as a single system to the clients
    System management tools manage as if a single server
    Service and system execution information available in single cluster wide log

Design Goals (2)
  Availability
    On failure detection
      Restart application on another node
      Migrate other resources ownership
    Restart policy can specify availability requirements of the application
    Hardware/software upgrades possible in phased manner

Cluster Abstractions
  Node: Runs an instance of Cluster Service
    Defined and active
  Resource
    Functionality offered at a node
    Physical: printer
    Logical: IP address
    Applications implement logical resources
      Exchange mail database
      SAP applications
  Quorum Resource
    Persistent storage for Cluster Configuration Database
    Arbitration mechanism to control membership
    Partition on a fault tolerant shared SCSI disk

Cluster Abstractions (2)
  Resource Dependencies
    Dependency trees: Sequence to bring resources online
  Resource Groups
    Unit of migration
  Virtual servers
    Application runs within virtual server environment
    Illusion to applications, administrators, and clients of a single stable environment
    Client connects using virtual server name
    Enables many application instances to run on the same physical node

Cluster Abstractions (3)
  Cluster Configuration Database
    Replicated at each node
    Accessed through NT registry
    Updates applied using Global Update Protocol
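
The dependency trees in Cluster Abstractions (2) fix the sequence in which resources are brought online. A minimal sketch of that ordering, using the standard-library graphlib module (Python 3.9+); the resource names and the dependency map are hypothetical, chosen only for illustration, and do not come from MSCS:

```python
from graphlib import TopologicalSorter

# Hypothetical resource group: each resource maps to the set of resources
# it depends on. The names are illustrative only, not taken from MSCS.
dependencies = {
    "disk": set(),
    "ip_address": set(),
    "network_name": {"ip_address"},
    "database": {"disk", "network_name"},
}


def online_order(deps):
    """Return an order in which every resource comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = online_order(dependencies)
print(order)  # dependencies always appear before the resources that need them
```

A topological sort is one plausible way to derive such a sequence; the slides do not say how MSCS computes it.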

Cluster Membership Operation
  [Node state diagram; the arrows are not recoverable from the extracted text.
   States: Offline, Member Search, Joining, Paused, Online, Exiting, Sleeping, Quorum Disk Search, Forming, Initializing.
   Transition labels, in extraction order: Start Cluster Service; Fails; Cluster Service Started; Resume; Pause; Join Succeeds; Join Fails; Found Online Member; Search Fails; Search Fails; Evict or Leave Cluster; Shutdown System; Stop Cluster Service; Synchronize Succeeds; Timeout; Retries Exceeded; Complete Rundown; Quorum Disk Found.
   Key: externally visible state, internal state.]

Member Join
  Sponsor broadcasts the identity of the joining node
  Sponsor informs the joining node about
    Current membership
    Cluster configuration database
  Joining member's heartbeats start
  Sponsor waits for the first heartbeat
  Sponsor signals the other nodes to consider the joining node a full member
  Acknowledgement is sent to the joining node
  On failure,
    Join operation aborted
    Joining node removed from the membership

Member Regroup
  Upon suspicion that an active node has failed, member regroup operation is executed to detect any membership changes
  Reasons for suspicion:
    missing heartbeats
    power failures
  The regroup algorithm moves each node through 6 stages
  Each node sends periodic messages to all other nodes, indicating which stage it has finished
    Barrier synchronization

Regroup Algorithm
  Activate: After a local clock tick, each node sends and collects status messages. A node advances if all responses are collected or a timeout occurs
  Closing: It is determined if partitions exist and if the current node's partition should survive
  Pruning: All nodes that are pruned for lack of connectivity halt
  Cleanup phase one: All the surviving nodes
    Install new membership
    Mark the halted nodes as inactive
    Inform the cluster network manager to filter out halted nodes' messages
    Make event manager invoke local callback handlers announcing node failures
  Cleanup phase two: A second cleanup callback is invoked to allow a coordinated two-phase cleanup
  Stabilized: The regroup has finished

Partition Survival
  A partition survives if any of the following is satisfied:
    n(new membership) > 1/2 * n(original membership)
    Following three conditions satisfied together
      n(new membership) = 1/2 * n(original membership)
      n(new membership) ≥ 2
      tiebreaker node ∈ (new membership)
    Following three conditions satisfied together
      n(original membership) = 2
      n(new membership) = 1
      quorum disk ∈ (new membership)

Resource Management
  Resource control DLL for each type of resource
  Polymorphic design allows easy management of varied resource types
  Resource state transition diagram:
    [States: Offline, Online-pending, Failed, Offline-pending, Online.
     Transition labels: Request to offline (twice), Request to online, Init failed, Init complete, Shutdown complete.]

Resource Migration: Pushing a group
  Executed when
    Resource failure at the original node
    Resource group prefers to execute at other node
    Adminis
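
The Partition Survival rules can be written as a single predicate. This is a sketch under stated assumptions, not the MSCS implementation: the comparison operators are missing from the slide text as extracted, so the code assumes a strict majority in the first rule, "at least two" in the second, and set membership for the tiebreaker node and the quorum disk. The function name and its parameters are hypothetical.

```python
def partition_survives(original, new, tiebreaker, quorum_owner):
    """Return True if the partition `new` should survive a regroup.

    original, new : sets of node names (membership before / after regroup)
    tiebreaker    : name of the tiebreaker node (hypothetical parameter)
    quorum_owner  : name of the node holding the quorum disk (hypothetical)
    """
    n_orig, n_new = len(original), len(new)

    # Rule 1 (assumed ">"): strictly more than half of the original membership.
    if 2 * n_new > n_orig:
        return True
    # Rule 2: exactly half, at least two nodes, and the tiebreaker is present.
    if 2 * n_new == n_orig and n_new >= 2 and tiebreaker in new:
        return True
    # Rule 3: a two-node cluster reduced to the one node with the quorum disk.
    if n_orig == 2 and n_new == 1 and quorum_owner in new:
        return True
    return False


four = {"a", "b", "c", "d"}
print(partition_survives(four, {"a", "b"}, "a", "c"))      # True: half plus tiebreaker
print(partition_survives(four, {"c", "d"}, "a", "c"))      # False: half, no tiebreaker
print(partition_survives({"a", "b"}, {"b"}, "a", "b"))     # True: holds the quorum disk
```

Comparing 2 * n_new against n_orig avoids fractional arithmetic when the original membership is odd.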
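
The availability formula on the Prerequisites slide, Availability = MTTF / (MTTF + MTTR), can be checked numerically. The sketch below is illustrative only: the function names, the sample figures, and the 8760-hour year are assumptions, not values from the presentation.

```python
def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR), as a fraction between 0 and 1."""
    if mttf_hours < 0 or mttr_hours < 0 or mttf_hours + mttr_hours == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_per_year_hours(a, hours_per_year=8760):
    """Expected downtime in hours per non-leap year for availability `a`."""
    return (1 - a) * hours_per_year


# Made-up numbers: a node that fails every 1000 h and takes 1 h to repair.
a = availability(1000, 1)
print(f"availability  = {a:.5f}")
print(f"downtime/year = {downtime_per_year_hours(a):.2f} h")
```

Because MTTR appears only in the denominator, shortening repair time (for example by automatic fail over to another node) raises availability even when failures are no less frequent.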
Link address: https://www.163wenku.com/p-3761716.html