Aneka is an outcome of research in distributed and grid computing at the University
of Melbourne, Australia. It offers a PaaS facility for cloud computing. ‘Aneka’ is a Sanskrit term
meaning ‘many in one’. What sets the solution apart is its support for multiple programming
models: task programming, thread programming and MapReduce programming. This
section discusses these programming models with reference to their implementations in the
Aneka cloud.
Each of the three programming models supported by Aneka has three main elements:
‘Executors’, ‘Schedulers’ and ‘WorkUnits’. Apart from these there is a ‘Manager’. The ‘WorkUnit’
is a logical entity that defines the unit of executable work that Aneka can handle.
Figure 19.9 shows the component structure of the Aneka execution model.
The ‘Scheduler’ arranges the execution of the work units that make up an application, distributes
them to multiple executing nodes (‘Executors’), receives the results and sends them to users.
The ‘Manager’ is a client component that communicates with the Aneka system on behalf of
the client system.
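The relationship between these components can be pictured with a small Python sketch. It is only an illustration of the roles described above, not Aneka's API: the class names reuse the terms from the text, while `square_units`, the thread pool standing in for executor nodes, and the sample values are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class WorkUnit:
    """Smallest schedulable piece of an application (illustrative stand-in)."""
    unit_id: int
    action: Callable[[], int]


class Scheduler:
    """Distributes work units to a pool of executors and gathers the results."""

    def __init__(self, executor_count: int) -> None:
        self.executor_count = executor_count

    def run(self, units: List[WorkUnit]) -> Dict[int, int]:
        # Each pool worker plays the role of one Executor node.
        with ThreadPoolExecutor(max_workers=self.executor_count) as pool:
            futures = {unit.unit_id: pool.submit(unit.action) for unit in units}
            return {uid: future.result() for uid, future in futures.items()}


class Manager:
    """Client-side component that talks to the scheduler for the application."""

    def __init__(self, scheduler: Scheduler) -> None:
        self.scheduler = scheduler

    def submit(self, units: List[WorkUnit]) -> Dict[int, int]:
        return self.scheduler.run(units)


def square_units(values: List[int]) -> List[WorkUnit]:
    # The default argument binds each value so every closure keeps its own copy.
    return [WorkUnit(i, lambda v=v: v * v) for i, v in enumerate(values)]


if __name__ == "__main__":
    manager = Manager(Scheduler(executor_count=3))
    print(manager.submit(square_units([2, 3, 4])))
```

The client code only ever touches the `Manager`; how many executors exist and which one runs a given work unit stays hidden behind the `Scheduler`.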
Aneka has been commercialized by Manjrasoft, an Australian company formed
to commercialize the grid and cloud computing software technologies developed at a research
lab of the University of Melbourne. Manjrasoft first released the beta version of Aneka in 2009.
Aneka, which means ‘many-in-one’, is so named because it supports multiple programming
models such as task programming, thread programming and MapReduce programming.
Aneka is built over the Microsoft .NET framework and provides a run-time and a set of APIs for
developing .NET applications on it. Aneka works as a workload management and distribution
platform that speeds up applications in the Microsoft .NET framework environment.
Aneka can also be deployed on third-party IaaS, as it supports provisioning
of resources on public clouds such as Amazon EC2, Microsoft Azure and GoGrid, as well as on
private clouds. This helps in building hybrid applications with minimal programming
effort.
Aneka comprises two key components:
■ Tools for rapid application development and a software development kit (SDK) containing
application program interfaces (APIs), and
■ A run-time engine to manage deployment and execution of applications.
The following sections briefly discuss the three programming models supported by Aneka.
19.14.1 Thread Programming
A high-performance computing system focuses on delivering higher computational output.
High throughput in computation is achieved by allowing concurrency through multi-processing
and multi-threading. A process represents a program in execution. In multi-processing, multiple
processes are executed in parallel on a single machine. Such a system is meant to support
multitasking. A thread, on the other hand, represents a single flow of control within a process.
A system supports multi-threading when it can execute different threads in parallel within a process.
19.14.1.1 Multi-threading in Aneka
For high-end requirements, the performance of multi-threaded applications on a
single multi-core system (a system having two or more processing units, known as cores,
which are generally packaged as a single component) becomes insufficient. In such cases,
distributed execution of the application is the only solution. For this purpose, an application can
be decomposed into several units.
In Aneka, multi-threaded programming is implemented over the cloud using the Thread Programming
Model. In this model, threads are treated as distributed threads known as Aneka threads.
Aneka threads follow the principles of local threads but can be executed over a distributed
system architecture. Aneka schedules the execution of threads efficiently, while creation and
control of the threads remain the responsibility of the application developer.
APIs for Aneka thread programming imitate the .NET thread class library. Hence it
takes little effort to port a .NET-based multi-threaded application to Aneka, as the transition
between a .NET thread and an Aneka thread is almost transparent. .NET applications need not
be fully rewritten to be ported to the Aneka platform; replacing the class
System.Threading.Thread with AnekaThread largely does the trick.
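The idea of a distributed thread that mirrors the local thread API can be illustrated with a Python analogy. This sketch does not use Aneka or .NET: `DistributedThread` and `run_workers` are hypothetical names, and a local thread pool stands in for remote nodes. It only shows why code written against `start()` and `join()` can switch thread classes with almost no changes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for remote nodes; in a real deployment the work would leave the machine.
_REMOTE_POOL = ThreadPoolExecutor(max_workers=4)


class DistributedThread:
    """Hypothetical class exposing the same start()/join() surface as threading.Thread."""

    def __init__(self, target, args=()):
        self._target = target
        self._args = args
        self._future = None

    def start(self):
        # A local thread would begin running here; this class hands the work to the pool.
        self._future = _REMOTE_POOL.submit(self._target, *self._args)

    def join(self, timeout=None):
        if self._future is None:
            raise RuntimeError("cannot join a thread before it is started")
        self._future.result(timeout=timeout)


def run_workers(thread_class, count):
    """Run `count` workers using whichever thread class is supplied."""
    results = [0] * count

    def work(index):
        results[index] = index * index

    threads = [thread_class(target=work, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_workers(threading.Thread, 4))
    print(run_workers(DistributedThread, 4))
```

`run_workers` is unchanged between the two calls; only the class passed in differs, which is the kind of near-transparent substitution the text describes.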
In the Aneka Thread Programming model, the work units are represented as Aneka
threads. Programmatically, the concept is implemented by using the template class
AnekaApplication<AnekaThread, ThreadManager>. The ‘AnekaApplication’ class type
is used for all distributed applications using the Thread Programming model in Aneka. A
Configuration class also defines the application’s interaction with the cloud middleware.
FIG 19.10: Aneka PaaS model (layers shown: Business Applications; Aneka PaaS with support for
multiple programming models: Task, Thread, MapReduce; Public Cloud IaaS and Private Cloud IaaS)
In the Aneka Thread Programming model, an application is treated as a collection of threads,
called Aneka threads, which can be executed remotely over distributed environments.
19.14.2 Task Programming
Thread programming provides parallelism in execution and can run on a single system as well as
over a distributed system architecture. Many such programming models are available which tie the
power of multiple computing systems together. The task programming model was designed to be
executed over clusters, and architectural distribution is inherent in it. This programming model
provides an attractive solution for executing high-performance distributed applications.
In the task programming model, an application is considered a collection of tasks which are
independent of each other and can be executed in any order. The notion of a task is defined by
every operating system in its design. All present-day OSs support multitasking, where multiple
tasks can be executed concurrently.
A task is a combination of one or more programs constituting a computing unit of
an application. The computing unit must represent a component of the application that can be
executed independently, in isolation. An application, in turn, is a collection of multiple
tasks. A task generally takes input files and produces output file(s) as outcomes.
Depending on the characteristics and requirements of applications, task computing can
be divided into two primary categories: High-Performance Computing and High-Throughput
Computing. Each category has some specific infrastructural requirements.
High-Performance Computing (HPC) is the use of the task programming model for executing
applications that need high computing power over a relatively short period of time. HPC
combines tasks in a tightly coupled manner and hence requires very low latency in network
communication to minimize data exchange time. Thus a low-latency network is a requirement
for the HPC model. Traditionally, clusters are designed to support HPC applications.
High-Throughput Computing (HTC) is the use of the task programming model for executing
applications which need high computing power over a longer period of time. HTC
applications generally consist of a large number of independent tasks which run for
a long time (several weeks or months). Such tasks need not communicate during
execution and can be easily scheduled over a distributed system architecture. Traditionally,
computing grids, which are composed of heterogeneous resources, support HTC
applications very well.
There is another category in task computing, called Many Task Computing (MTC), that
combines aspects of both HPC and HTC. Tasks under the MTC model are loosely coupled but
communication intensive. The cloud infrastructure model is the most suitable to support MTC.
19.14.2.1 Task Programming in Aneka
The Aneka task programming model supports the development of distributed applications over
the Aneka platform. Aneka tasks are implemented through APIs which are
packed into the ‘ITask’ (Aneka.Tasks.ITask) interface. Tasks created at local nodes can be passed
over to the Aneka cloud, where support for executing Aneka tasks is implemented. Figure 19.11
describes the scenario.
The ‘AnekaApplication’ class, specialized for handling tasks, bundles together the
tasks created through the ‘ITask’ interface and all of their dependencies (like library and data files).
Of the other client-side components, ‘AnekaTask’ wraps the tasks and represents them to the cloud,
while ‘TaskManager’ submits tasks to the Aneka cloud, monitors execution and accepts the returned
results. In the Aneka cloud, four services coordinate the entire task execution activity,
namely ‘MembershipCatalogue’, ‘TaskScheduler’, ‘ExecutionService’ and ‘StorageService’.
Among these, ‘TaskScheduler’ schedules the execution of tasks among the resources.
Tasks are created by developers using Aneka-supplied interfaces and classes and are then handed
over to the Aneka cloud for execution.
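The pattern of defining tasks against a common interface and handing them over for execution can be sketched in Python. This is an analogy, not Aneka's API: the `Task` base class, its `execute` method, `WordCountTask` and `submit_all` are all hypothetical names chosen for the illustration, and a local thread pool stands in for the cloud.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor, as_completed


class Task(ABC):
    """Hypothetical task interface: one entry point that does the work."""

    @abstractmethod
    def execute(self):
        ...


class WordCountTask(Task):
    """Example task: counts the words in one piece of text."""

    def __init__(self, text):
        self.text = text

    def execute(self):
        return len(self.text.split())


def submit_all(tasks, workers=2):
    """Run independent tasks in any order and map each task's text to its result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(task.execute): task for task in tasks}
        # as_completed yields futures in completion order, which is acceptable
        # because the tasks do not depend on one another.
        return {pending[f].text: f.result() for f in as_completed(pending)}


if __name__ == "__main__":
    jobs = [WordCountTask("a b c"), WordCountTask("hello world")]
    print(submit_all(jobs))
```

Because every task is self-contained, the submitter needs no knowledge of what a task computes; it only needs the common entry point.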
19.14.3 MapReduce Programming
Several applications today produce huge volumes of data that need to be stored and processed
efficiently. This is a challenging task and is known as data-intensive computing. In the cloud
computing environment, data-intensive computing occurs in many domains, for business
analysis or scientific simulation purposes.
MapReduce is a programming model introduced by Google to process large volumes of
data. It works by representing the computational task using two functions, map and reduce.
The underlying storage infrastructure is distributed in this model, and data is generally
represented as key-value pairs.
The job of the ‘Map’ function is to filter and sort data into queues. For example, the names of
students can be sorted by surname, with one queue maintained for each distinct
surname. The ‘Reduce’ function performs a summary operation. For example, a summary may show
the number of students under each distinct surname. Hence a MapReduce
operation takes key-value pairs as input and produces lists as output.
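The surname example can be worked through in a few lines of Python. This is a minimal single-machine sketch of the idea, not a distributed implementation; the function names and the sample student names are invented for illustration, and the surname is assumed to be the last word of each name.

```python
from collections import defaultdict


def map_student(full_name):
    """Map: emit a (surname, 1) pair for one student record."""
    surname = full_name.split()[-1]
    return (surname, 1)


def group_by_key(pairs):
    """Collect mapped values into one queue (list) per distinct key."""
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return queues


def reduce_counts(surname, values):
    """Reduce: summarize one queue as the number of students with that surname."""
    return (surname, sum(values))


def count_surnames(students):
    mapped = [map_student(name) for name in students]
    grouped = group_by_key(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_surnames(["Asha Rao", "Vikram Rao", "Meera Sen"]))
```

Each call to `map_student` looks at one record only, and each call to `reduce_counts` looks at one queue only, which is what allows both steps to be spread across many machines.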
19.14.3.1 MapReduce Programming in Aneka
A MapReduce operation executes in two phases. First, multiple ‘Map’ operations run in parallel,
independently of one another. In the second phase, ‘Reduce’ operations operate on the output
produced in the first phase. MapReduce in Aneka has been implemented following its implementation
in Hadoop (a Java-based open-source programming framework). Figure 19.12 represents the model in Aneka.
The Mapper and Reducer correspond to the class implementations of the map and reduce
operations respectively. The Mapper and Reducer classes are extended from the Aneka MapReduce API.
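The two-phase structure, with user classes extending framework-supplied base classes, can be sketched as follows. The `Mapper` and `Reducer` base classes here are hypothetical Python stand-ins, not the Aneka MapReduce API, and `run_job`, `WordMapper` and `SumReducer` are names assumed for the example; a thread pool plays the part of the parallel map workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Mapper:
    """Hypothetical base class: subclasses turn one input split into key-value pairs."""

    def map(self, split):
        raise NotImplementedError


class Reducer:
    """Hypothetical base class: subclasses fold all values of one key into a result."""

    def reduce(self, key, values):
        raise NotImplementedError


class WordMapper(Mapper):
    def map(self, split):
        return [(word.lower(), 1) for word in split.split()]


class SumReducer(Reducer):
    def reduce(self, key, values):
        return sum(values)


def run_job(splits, mapper, reducer, workers=4):
    # Phase 1: map operations run independently, one per input split.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(mapper.map, splits))

    # Group intermediate pairs by key once every map has finished.
    grouped = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            grouped[key].append(value)

    # Phase 2: reduce operations consume the output of phase 1.
    return {key: reducer.reduce(key, values) for key, values in grouped.items()}


if __name__ == "__main__":
    print(run_job(["to be", "or not to be"], WordMapper(), SumReducer()))
```

The application author writes only the two small subclasses; splitting the input, running the phases in order and grouping the intermediate pairs belong to the framework side of the sketch.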
The run-time implementation comprises three modules:
■ A supporting distributed file system,
■ MapReduce Scheduling module and
■ MapReduce Execution module.
Local data files from the client’s MapReduce application are submitted along with the MapReduce job.
Thereafter, the process remains transparent to the client, and the output is returned to the client.
Aneka PaaS consumers have the option of developing applications following the Thread model,
the Task model or the MapReduce model, as Aneka supports all of them.