❖ Cloud computing shares many characteristics with high-performance computing. Hence, the
file-system and file-processing characteristics of high-performance computing environments
also apply to cloud computing.
❖ Efficient processing of large data-sets is critical to the success of high-performance
computing systems, and large, complex data-sets are generated in the cloud continually.
❖ High-performance processing of large data-sets requires parallel execution over partitioned
data spread across distributed computing nodes. This capability must be supported by suitable
data-processing programming models and appropriate file systems.
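The idea above can be sketched in a few lines of Python. This is a hypothetical, single-machine illustration (not any particular framework's API): the data-set is split into partitions, each partition is processed in parallel by a worker process standing in for a distributed node, and the partial results are combined.

```python
# Minimal sketch: partition a data-set and process the partitions in
# parallel, using local processes as stand-ins for distributed nodes.
from multiprocessing import Pool

def process_partition(partition):
    # Stand-in for real per-node work: here, summing the partition.
    return sum(partition)

def partitioned_sum(data, num_partitions=4):
    # Split the data-set into roughly equal partitions.
    size = (len(data) + num_partitions - 1) // num_partitions
    partitions = [data[i:i + size] for i in range(0, len(data), size)]
    # Process every partition in parallel.
    with Pool(len(partitions)) as pool:
        partial_results = pool.map(process_partition, partitions)
    # Combine the partial results from all "nodes".
    return sum(partial_results)

if __name__ == "__main__":
    print(partitioned_sum(list(range(100))))  # 4950
```

A real system additionally handles data placement, node failures, and network transfer, which is exactly what the programming models and file systems discussed next provide.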
❖ Several programming models have been developed for high-performance processing of large
data-sets. Among them, Google’s MapReduce is a well-accepted model for processing massive
amounts of unstructured data in parallel across a distributed processing environment.
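The MapReduce model can be illustrated with the classic word-count example. The sketch below is a simplified, single-machine version (the real Google and Hadoop implementations distribute these phases across many nodes): a map phase emits (key, value) pairs, a shuffle phase groups values by key, and a reduce phase aggregates each group.

```python
# Simplified single-machine sketch of the MapReduce word-count example.
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by their key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the values for each key.
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))

print(word_count(["the cloud", "the data the cloud"]))
# {'the': 3, 'cloud': 2, 'data': 1}
```

Because map calls are independent of one another, and reduce operates per key, both phases parallelize naturally across partitioned data — which is what makes the model suitable for massive unstructured data-sets.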
❖ Several other models and tools influenced by MapReduce have emerged; Hadoop (an
open-source implementation of the model), Pig and Hive are notable examples.
❖ Among the various file systems supporting high-performance processing of data, Google File
System (GFS) is considered the pioneer. The open-source Hadoop Distributed File System
(HDFS) is inspired by GFS.
❖ Storage in the cloud is delivered in two categories: for general users and for developers.
Storage for general users is delivered as SaaS, while storage for developers is delivered as IaaS.
❖ For general users, the cloud provides ready-to-use storage that is configured and maintained
by the provider. Users can store data directly, without performing any set-up or administration
of the storage themselves. Because consumers cannot manage this storage, it is known as
'unmanaged' storage.
❖ Managed storage, in contrast, is raw storage that users configure and manage themselves,
for example by partitioning and formatting it. Developers typically use this kind of storage.