Books / non-fiction

Learning big data with Amazon Elastic MapReduce : easily learn, build, and execute real-world big data solutions using Hadoop and AWS EMR (English)

Part of Professional expertise distilled

Amar Kant Singh

Description

Summary: This book is aimed at developers and system administrators who want to learn about Big Data analysis using Amazon Elastic MapReduce. Basic Java programming knowledge is required. You should be comfortable with using command-line tools. Prior knowledge of AWS, API, and CLI tools is not assumed. Also, no exposure to Hadoop and MapReduce is expected.

Contents

Latest edition, e-book

Cover; Copyright; Credits; About the Authors; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface

Chapter 1: Amazon Web Services; What is Amazon Web Services?; Structure and Design; Regions; Availability Zones; Services provided by AWS; Compute; Amazon EC2; Auto Scaling; Elastic Load Balancing; Amazon Workspaces; Storage; Amazon S3; Amazon EBS; Amazon Glacier; AWS Storage Gateway; AWS Import/Export; Databases; Amazon RDS; Amazon DynamoDB; Amazon Redshift; Amazon ElastiCache; Networking and CDN; Amazon VPC; Amazon Route 53; Amazon CloudFront; AWS Direct Connect; Analytics; Amazon EMR; Amazon Kinesis; AWS Data Pipeline; Application services; Amazon CloudSearch (Beta); Amazon SQS; Amazon SNS; Amazon SES; Amazon AppStream; Amazon Elastic Transcoder; Amazon SWF; Deployment and Management; AWS Identity and Access Management; Amazon CloudWatch; AWS Elastic Beanstalk; AWS CloudFormation; AWS OpsWorks; AWS CloudHSM; AWS CloudTrail; AWS Pricing; Creating an account on AWS; Step 1 - Creating an Amazon.com account; Step 2 - Providing a payment method; Step 3 - Identity verification by telephone; Step 4 - Selecting the AWS support plan; Launching the AWS management console; Getting started with Amazon EC2; How to start a machine on AWS?; Step 1 - Choosing an Amazon Machine Image; Step 2 - Choosing an instance type; Step 3 - Configuring instance details; Step 4 - Adding storage; Step 5 - Tagging your instance; Step 6 - Configuring a security group; Communicating with the launched instance; EC2 instance types; General purpose; Memory optimized; Compute optimized; Getting started with Amazon S3; Creating a S3 bucket; Bucket naming; S3cmd; Summary

Chapter 2: MapReduce; The map function; The reduce function; Divide and conquer; What is MapReduce?; The map reduce function models; The map function model; The reduce function model; Data life cycle in the MapReduce framework; Creation of input data splits; Record reader; Mapper; Combiner; Partitioner; Shuffle and sort; Reducer; Real-world examples and use cases of MapReduce; Social networks; Media and entertainment; E-commerce and websites; Fraud detection and financial analytics; Search engines and ad networks; ETL and data analytics; Software distributions built on the MapReduce framework; Apache Hadoop; MapR; Cloudera distribution; Summary

Chapter 3: Apache Hadoop; What is Apache Hadoop?; Hadoop modules; Hadoop Distributed File System; Major architectural goals of HDFS; Block replication and rack awareness; The HDFS architecture; NameNode; DataNode; Apache Hadoop MapReduce; Hadoop MapReduce 1.x; JobTracker; TaskTracker; Hadoop MapReduce 2.0; Hadoop YARN; Apache Hadoop as a platform; Apache Pig; Apache Hive; Summary

Chapter 4: Amazon EMR - Hadoop on Amazon Web Services; What is AWS EMR?; Features of EMR; Accessing Amazon EMR features; Programming on AWS EMR; The EMR architecture; Types of nodes; EMR Job Flow and Steps; Job Steps; An EMR cluster; Hadoop filesystem on EMR - S3 and HDFS

Professional expertise distilled

IBM Lotus Sametime 8 essentials : a user's guide : mastering online enterprise communication with this collaborative software

Marie L. Scott

Java EE 7 developer handbook : develop professional applications in Java EE 7 with this essential reference guide

Peter A. Pilgrim

Developing web applications with Oracle ADF Essentials : quickly build attractive, user-friendly web applications using Oracle's free ADF Essentials toolkit

Sten E. Vesterli

Getting started with SQL Server 2014 administration : optimize your database server to be fast, efficient, and highly secure using the brand new features of SQL Server 2014

Gethryn Ellis

AWS development essentials : design and build flexible, highly scalable, and cost-effective applications using Amazon web services

Prabhakaran Kuppusamy

Learning big data with Amazon Elastic MapReduce : easily learn, build, and execute real-world big data solutions using Hadoop and AWS EMR

Amar Kant Singh

Mastering QlikView : unleash the power of QlikView and Qlik Sense to make optimum use of data for Business Intelligence

Stephen Redmond

Information and editions

2014

E-book, Amar Kant Singh, Packt Publishing, 1st edition