Hadoop overview

I've spent some time with the Hortonworks sandbox and reading about the Hadoop project. It's fun and you can learn a lot quickly, but I think there's a lot of confusion when you face Hadoop for the first time: too many projects, concepts, and topics. The bad news is that you cannot easily use it in the real world unless you work with big data. But don't worry. If you want to explore and learn about Hadoop, you can try the Hortonworks virtual machine and read books, articles, and more. Trying this technology on your own computer is the first step toward understanding how Hadoop works. Before starting, we have to study the infrastructure and get an overview of the main projects.

From hardware to software

HDFS, the Hadoop file system, is the layer closest to the hardware infrastructure. The following list covers the most important terms and concepts you need to know to understand the Hadoop architecture.

There are many software projects and they seem very complex

Hadoop is made up of a set of projects, and each project can relate to others. Let's look at the names of the main projects and their roles in a Hadoop architecture.

Other related projects