Hadoop: Difference between revisions

Revision as of 16:43, 16 September 2011

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. "Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework." Hadoop Overview</ref> It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Googles's MapReduce and Google File System (GFS) papers.

Revision as of 16:42, 16 September 2011 (view source) Jpr@uab.edu (talk \| contribs) (Create intro using wikipedia:Hadoop page intro)		Revision as of 16:43, 16 September 2011 (view source) Jpr@uab.edu (talk \| contribs) (fix wikipedia interwiki links) Newer edit →
Line 1:		Line 1:
	'''Apache Hadoop''' is a [[wikipedia:software framework]] that supports data-intensive [[wikipedia:distributed computing\|distributed applications]] under a [[wikipedia:Free software\|free license]]. "Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework." [http://wiki.apache.org/hadoop/ProjectDescription Hadoop Overview]</ref> It enables applications to work with thousands of nodes and [[wikipedia:petabytes]] of data. Hadoop was inspired by [[wikipedia:Google]]'s [[wikipedia:MapReduce]] and [[wikipedia:GoogleFS\|Google File System]] (GFS) papers.		'''Apache Hadoop''' is a [[wikipedia:software framework\|software framework]] that supports data-intensive [[wikipedia:distributed computing\|distributed applications]] under a [[wikipedia:Free software\|free license]]. "Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework." [http://wiki.apache.org/hadoop/ProjectDescription Hadoop Overview]</ref> It enables applications to work with thousands of nodes and [[wikipedia:petabytes\|petabytes]] of data. Hadoop was inspired by [[wikipedia:Google\|Googles]]'s [[wikipedia:MapReduce\|MapReduce]] and [[wikipedia:GoogleFS\|Google File System]] (GFS) papers.

Hadoop: Difference between revisions

Revision as of 16:43, 16 September 2011

Navigation menu

Search