Note: The contents of this page have been processed automatically, and papers by authors with similar names are all shown in the same section.
1. A Review on Fault Tolerance Techniques for High Performance Computing
Author(s):
Publication info: National Conference on Computer Engineering and Information Technology Management, year
Number of pages: 7
Cloud computing is the next generation of computing. Using a large number of virtual machines for computationally intensive applications brings new capacity and flexibility to HPC (High Performance Computing) applications. Today's high performance computing systems are typically managed and operated privately by individual organizations. A cloud-based Infrastructure-as-a-Service (IaaS) approach for high performance computing applications promises cost savings and more flexibility. HPC systems may fail because of large workloads and the large number of servers. Fault tolerance techniques allow HPC systems on the cloud to execute computationally intensive applications across multiple nodes. Fault tolerance can provide the best performance of tasks in the presence of hardware and software faults. However, the main failures are mostly hardware based. System availability is also very important, and fault tolerance techniques are used to detect and predict faults. This paper gives an overview of the most popular fault tolerance techniques in HPC, along with the prediction models and tools used in HPC.

2. A Review on Fault Tolerance Techniques for High Performance Computing
Author(s):
Publication info: National Conference on Engineering Sciences, New Ideas (8), year
Number of pages: 4
Cloud computing is the next generation of computing. Using a large number of virtual machines for computationally intensive applications brings new capacity and flexibility to HPC (High Performance Computing) applications. Today's high performance computing systems are typically managed and operated privately by individual organizations. A cloud-based Infrastructure-as-a-Service (IaaS) approach for high performance computing applications promises cost savings and more flexibility. HPC systems may fail because of large workloads and the large number of servers. Fault tolerance techniques allow HPC systems on the cloud to execute computationally intensive applications across multiple nodes. Fault tolerance can provide the best performance of tasks in the presence of hardware and software faults. However, the main failures are mostly hardware based. System availability is also very important, and fault tolerance techniques are used to detect and predict faults. This paper gives an overview of the most popular fault tolerance techniques in HPC, along with the prediction models and tools used in HPC.
Showing results 1 to 2 of 2