Sold by Sack Fachmedien

Singh

Beginning Hadoop

Understanding Hadoop Scalability and Performance of Clusters

Medium: Book
ISBN: 978-1-4842-1354-4
Publisher: Apress
Publication date: 8 April 2016
Delivery time: up to 10 days
There are many challenges in setting up and scaling distributed frameworks like Hadoop.

Although Hadoop is an open-source product with plenty of good documentation and books, it is difficult for an individual or an enterprise to define use cases or working models while also clearly understanding how it works and tuning it for optimal performance.

Pro Hadoop Administration by Gurmukh Singh, a Hadoop specialist and infrastructure architect, takes a deep dive into configuring Hadoop services and integrating them with various tools and frameworks. The book covers the process from scratch through building a production-level Hadoop cluster, with best practices and optimal performance.

You will learn:

- Use cases and a set of recipes for the Hadoop production environment.
- Everything from compiling Hadoop to setting up a cluster with highly available services.
- Its integration with tools such as Sqoop, Flume, HBase, Hive, and many more.
- Performance tuning and cluster planning.
- Hadoop security, including Kerberos and encryption, as well as OS-level and network-level security.

Product properties

  • Item number: 9781484213544
  • Medium: Book
  • ISBN: 978-1-4842-1354-4
  • Publisher: Apress
  • Publication date: 8 April 2016
  • Language(s): English
  • Edition: 1st edition, 2016
  • Product form: Paperback (PB)
  • Pages: 250
  • Format (W x H): 178 x 254 mm
  • Edition type: None, unknown

Authors/Editors

Authors

Singh, Gurmukh

Chapter 1: Introduction to Distributed Computing and Hadoop.
Chapter Goal: Discuss distributed computing, its challenges, and some of the existing platforms on the market.

Sub-Topics:
- Introduction to Distributed Computing.
- Introduction to Hadoop and its history.
- Current Hadoop distributions and their market.
- Problem statement: why Hadoop is needed, and its use cases.

Chapter 2: Hadoop as a Platform
Chapter Goal: Install and configure basic Hadoop services.

Sub-Topics:
- Hadoop Compilation.
- Hadoop Installation and its various modes
- Hadoop Daemons Configuration.
- Basic Hadoop Configuration Parameters.

Chapter 3: Hadoop Daemons and Services
Chapter Goal: Set up the Hadoop Secondary NameNode and understand its purpose.

Sub-Topics:
- Secondary NameNode Setup.
- NameNode Metadata Concepts.
- Recovery from the Secondary NameNode.
- Failover to the Secondary NameNode.

Chapter 4: Concepts of redundancy and Data Access
Chapter Goal: Understand how replication works and set up rack awareness.

Sub-Topics:
- Configure Hadoop Clients.
- Multi-A record Clients.
- Disk Storage Concepts.

Chapter 4: Hadoop Administration Tasks
Chapter Goal: Learn about the day-to-day activities performed by Hadoop admins, such as cluster balancing and handling disk space issues.

- Hadoop Cluster balancing.
- Cluster Membership.
- Adding Disks to Data Nodes
- NameNode Metadata Operations
- Trash Space Configuration

Chapter 5: User Quota Management and Schedulers
Chapter Goal: Learn about user management, space quotas, and job schedulers.
- User Management.
- Space Quota Management.
- Job Schedulers
- Queue setup and management.
- ACLs for Queues.

Chapter 6: Hadoop 2.x and YARN Configuration
Chapter Goal: Learn about Hadoop 2.x features and YARN framework.
- Introduction to Hadoop 2.x.
- Hadoop 2.x features.
- Introduction to YARN and its components.
- Installation and Configuration of YARN.
- Set up Job Queues.

Chapter 7: Making Services Highly Available
Chapter Goal: Learn about High Availability for the NameNode and the ResourceManager.
- Namenode HA using Shared Storage.
- Namenode HA using QJM.
- Resource Manager HA.

Chapter 8: Data Ingestion using Hive, Pig, Sqoop, and Flume
Chapter Goal: Learn about Hive, Pig, Sqoop, and Flume for data ingestion.
- Introduction to Data Ingestion.
- Introduction to Pig and its installation.
- Introduction to Hive and its installation.
- Introduction to Sqoop and its installation.
- Introduction to Flume and its installation.
- Examples for Data Ingestion.

Chapter 9: Database for the Hadoop Platform.
Chapter Goal: Learn about HBase and its integration with other tools of Hadoop.
- Introduction to HBase.
- HBase Installation.
- HBase with Hive.
- Importing Data from HBase.
- Phoenix with HBase.

Chapter 10: Hadoop Security.
Chapter Goal: Learn about securing Hadoop with Kerberos and other tools.
- Introduction to Kerberos.
- Installing and Configuring Kerberos.
- Hadoop with Kerberos.
- Securing Hadoop at the OS level.

Chapter 11: Hadoop Cluster Planning and Performance.
Chapter Goal: Learn about cluster planning, performance tuning, and benchmarking.
- Hadoop Cluster Planning.
- MapReduce Phases.
- Performance tuning.
- Hadoop Benchmarking.

Chapter 12: Hadoop Advanced Features.
Chapter Goal: Learn about Federation, snapshots, NFS, and WebHDFS.
- Introduction to Hadoop Federation.
- Set up Hadoop Federation.
- Introduction to Snapshots and their configuration.
- NFSv3 configuration for Hadoop.
- WebHDFS for REST API calls.