Here are just a few sample projects.
Using MapReduce: MapReduce
is a a software framework introduced by Google to support
distributed computing on large data sets on clusters of
computers. The framework is inspired by map and reduction
functions
commonly used in functional programming. Amazon has a
service called Amazon Elastic MapReduce.that uses the Hadoop
framework. One of the goals of the Hadoop framework is to
provide an open source implementation of MapReduce that
can run on a cluster or on rented hardware, e.g., Amazon EC2 cluster.
Examples include the
following:
- Analysis of a large data set e.g., a set of logfiles. An example of this can be found here. Other sets of documents include blogs and news articles. Some datasets are kept by Yahoo through its Webscope program and Amazon.
- Develop and run a scientific application on Amazon EC2 cloud. Describe your experience.
Dynamic Resource Allocation using Amazon CloudWatch: Manage
a cluster using Amazon CloudWatch. Define sets of policies
representing different objectives and evaluate the policies.
Develop a Local Cloud: Eucalyptus (or the Elastic Utility Computing Architecture ) is an open-source software
infrastructure for implementing "cloud computing" on clusters. Create a local cloud using Departmental resources.