Amazon Web Services

Amazon Web Services, Products, Tools, and Developer Information...

Amazon CloudFront: Looking Back, Looking Forward, Making Plans

Thu, 02/02/2012 - 22:59

Looking Back
In 2011 we added a total of seven edge locations to Amazon CloudFront and Route 53. We also added lots of new features, as I documented last year.

Looking Forward
Our newest edge locations are located in Milan, Italy and Osaka, Japan. This brings our total worldwide location count to 26 (see the CloudFront page for a complete list). Each new edge location helps lower latency and improves performance for your end users.

Making Plans
We have additional locations in the pipeline for 2012 and beyond. Our planning process takes a number of factors into account, including notes from our sales team and discussions on the Amazon CloudFront forum. We also collect latency measurements from a number of points around the globe to our current set of locations and correlate them with broadband Internet penetration and existing Amazon CloudFront usage in the area.

I would also like to invite you to participate in the Amazon CloudFront Edge Location Survey. We are very interested in your suggestions for additional locations. We'd also like to learn a bit more about the type of content that you deliver to your customers.

All Aboard
The CloudFront team is hiring. We need some Software Development Engineers, a Senior Systems Engineer, a Senior Software Development Manager, a Product Manager, and a Business Development Representative.

-- Jeff;

 

Categories: Vendor News

New Elastic MapReduce Features: Metrics, Updates, VPC, and Cluster Compute Support (Guest Post)

Tue, 01/31/2012 - 18:52

Today's guest blogger is Adam Gray. Adam is a Product Manager on the Elastic MapReduce Team.

-- Jeff;

We’re always excited when we can bring features to our customers that make it easier for them to derive value from their data—so it’s been a fun month for the EMR team. Here is a sampling of the things we’ve been working on.

Free CloudWatch Metrics
Starting today, customers can view graphs of 23 job flow metrics within the EMR Console by selecting the Monitoring tab in the Job Flow Details page. These metrics are pushed to CloudWatch every five minutes at no cost to you and include information on:

  • Job flow progress including metrics on the number of map and reduce tasks running and remaining in your job flow and the number of bytes read and written to S3 and HDFS.
  • Job flow contention including metrics on HDFS utilization, map and reduce slots open, jobs running, and the ratio between map tasks remaining and map slots.
  • Job flow health including metrics on whether your job flow is idle, if there are missing data blocks, and if there are any dead nodes.

Please watch this video to see how to view CloudWatch graphs in the EMR Console:

You can also learn more from the Viewing CloudWatch Metrics section of the EMR Developer Guide.

You can view the new metrics in the AWS Management Console:

Further, through the CloudWatch Console, API, or SDK you can set alarms to be notified via SNS if any of these metrics go outside of specified thresholds. For example, you can receive an email notification whenever a job flow is idle for more than 30 minutes, HDFS Utilization goes above 80%, or there are five times as many remaining map tasks as there are map slots, indicating that you may want to expand your cluster size.
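To make the three example conditions concrete, here is a minimal Python sketch that evaluates them against a snapshot of job flow metrics. The field names (idle_minutes, hdfs_utilization_pct, remaining_map_tasks, map_slots) are hypothetical and are not the actual CloudWatch metric names. In practice you would configure these thresholds as CloudWatch alarms rather than evaluate them in your own code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobFlowSnapshot:
    # Hypothetical field names for illustration; not CloudWatch metric names.
    idle_minutes: float
    hdfs_utilization_pct: float
    remaining_map_tasks: int
    map_slots: int


def triggered_alarms(snapshot: JobFlowSnapshot) -> List[str]:
    """Return the example conditions from the text that this snapshot meets."""
    alarms = []
    if snapshot.idle_minutes > 30:
        alarms.append("idle for more than 30 minutes")
    if snapshot.hdfs_utilization_pct > 80:
        alarms.append("HDFS utilization above 80%")
    # Five times as many remaining map tasks as map slots suggests the
    # cluster may be too small for the remaining work.
    if snapshot.map_slots > 0 and snapshot.remaining_map_tasks >= 5 * snapshot.map_slots:
        alarms.append("remaining map tasks at least 5x map slots")
    return alarms
```

A snapshot with 45 idle minutes, 60% HDFS utilization, 100 remaining map tasks, and 10 map slots would report the first and third conditions.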

Please watch this video to see how to set EMR alarms through the CloudWatch Console:

Hadoop 0.20.205, Pig 0.9.1, and AMI Versioning
EMR now supports running your job flows using Hadoop 0.20.205 and Pig 0.9.1. To simplify the upgrade process, we have also introduced the concept of AMI versions. You can now provide a specific AMI version to use at job flow launch or specify that you would like to use our “latest” AMI, ensuring that you are always using our most up-to-date features. The following AMI versions are now available:

  • Version 2.0.x: Hadoop 0.20.205, Hive 0.7.1, Pig 0.9.1, Debian 6.0.2 (Squeeze)
  • Version 1.0.x: Hadoop 0.18.3 and 0.20.2, Hive 0.5 and 0.7.1, Pig 0.3 and 0.6, Debian 5.0 (Lenny)

You can specify an AMI version when launching a job flow in the Ruby CLI using the --ami-version argument (note that you will have to download the latest version of the Ruby CLI):

$ ./elastic-mapreduce --create --alive --name "Test AMI Versioning" --ami-version latest --num-instances 5 --instance-type m1.small

Please visit the AMI Versioning section of the Elastic MapReduce Developer Guide for more information.

S3DistCp for Efficient Copy between S3 and HDFS
We have also made available S3DistCp, an extension of the open source Apache DistCp tool for distributed data copy, that has been optimized to work with Amazon S3. Using S3DistCp, you can efficiently copy large amounts of data between Amazon S3 and HDFS on your Amazon EMR job flow or copy files between Amazon S3 buckets. During data copy you can also optimize your files for Hadoop processing. This includes modifying compression schemes, concatenating small files, and creating partitions.

For example, you can load Amazon CloudFront logs from S3 into HDFS for processing while simultaneously changing the compression format from Gzip (the Amazon CloudFront default) to LZO and combining all the logs for a given hour into a single file. Because Hadoop jobs process a few large, LZO-compressed files more efficiently than many small, Gzip-compressed files, this can improve performance significantly.
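The "combine all the logs for a given hour" idea can be sketched in a few lines of Python. This is only an illustration of the grouping step, not S3DistCp itself, and it does not show any S3DistCp options. It assumes CloudFront log names follow the layout distribution-id.YYYY-MM-DD-HH.unique-id.gz; the file names in the usage note are made up.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed CloudFront log name layout: <distribution-id>.<YYYY-MM-DD-HH>.<unique-id>.gz
HOUR_PATTERN = re.compile(r"^[^.]+\.(\d{4}-\d{2}-\d{2}-\d{2})\.[^.]+\.gz$")


def group_logs_by_hour(file_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group log file names by the date-and-hour portion of the name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in file_names:
        match = HOUR_PATTERN.match(name)
        if match:  # names that do not fit the assumed layout are skipped
            groups[match.group(1)].append(name)
    return dict(groups)
```

Given two hypothetical files stamped 2012-01-31-18 and one stamped 2012-01-31-19, the function returns two groups, each of which would become a single output file in the scenario described above.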

Please see Distributed Copy Using S3DistCp in the Amazon Elastic MapReduce documentation for more details and code examples.

cc2.8xlarge Support
Amazon Elastic MapReduce also now supports the new Amazon EC2 Cluster Compute instance, Cluster Compute Eight Extra Large (cc2.8xlarge). Like other Cluster Compute instances, cc2.8xlarge instances are optimized for high performance computing, giving customers very high CPU capabilities and the ability to launch instances within a high bandwidth, low latency, full bisection bandwidth network. cc2.8xlarge instances provide customers with more than 2.5 times the CPU performance of the first Cluster Compute instance (cc1.4xlarge), more memory, and more local storage at a very compelling cost. Please visit the Instance Types section of the Amazon Elastic MapReduce detail page for more details.

In addition, we are pleased to announce an 18% reduction in Amazon Elastic MapReduce pricing for cc1.4xlarge instances, dropping the total per hour cost to $1.57. Please visit the Amazon Elastic MapReduce Pricing Page for more details.

VPC Support
Finally, we are excited to announce support for running job flows in an Amazon Virtual Private Cloud (Amazon VPC), making it easier for customers to:

  • Process sensitive data - Launching a job flow on Amazon VPC is similar to launching the job flow on a private network and provides additional tools, such as routing tables and Network ACLs, for defining who has access to the network. If you are processing sensitive data in your job flow, you may find these additional access control tools useful.
  • Access resources on an internal network - If your data is located on a private network, it may be impractical or undesirable to regularly upload that data into AWS for import into Amazon Elastic MapReduce, either because of the volume of data or because of its sensitive nature. Now you can launch your job flow on an Amazon VPC and connect to your data center directly through a VPN connection.

You can launch Amazon Elastic MapReduce job flows into your VPC through the Ruby CLI by using the --subnet argument and specifying the subnet address (note that you will have to download the latest version of the Ruby CLI):

$ ./elastic-mapreduce --create --alive --subnet "subnet-identifier"

Please visit the Running Job Flows on an Amazon VPC section in the Elastic MapReduce Developer Guide for more information.

-- Adam Gray, Product Manager, Amazon Elastic MapReduce.

Categories: Vendor News

Amazon S3 Growth for 2011 - Now 762 Billion Objects

Mon, 01/30/2012 - 22:15

As of the end of 2011, there are 762 billion (762,000,000,000) objects in Amazon S3. We process over 500,000 requests per second for these objects at peak times.

Here's the annual growth chart:

This represents year-over-year growth of 192%; S3 grew faster last year than it did in any year since it launched in 2006.
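For a sense of scale, the prior-year total can be backed out from these two figures. The sketch below reads "growth of 192%" as a 192% increase over the previous year's count. That reading is an assumption, and the result is an implied estimate rather than a number stated in the post.

```python
def implied_prior_count(current_count: float, growth_pct: float) -> float:
    """Back out the prior-year total, assuming current = prior * (1 + growth)."""
    return current_count / (1 + growth_pct / 100)


objects_end_2011 = 762e9  # from the post
prior = implied_prior_count(objects_end_2011, 192)
print(f"Implied objects at end of 2010: about {prior / 1e9:.0f} billion")
```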

Where are all of these objects coming from? Although we definitely made it easier for you to delete objects using Multi-Object Deletion and Object Expiration, we also gave you plenty of ways to upload new objects using Multipart upload, AWS Direct Connect, and AWS Import/Export.

As you can imagine, building, running, and adding new features to a system as large and as complex as S3 is no simple task. Here are some of the open positions on the S3 team:

-- Jeff;

Categories: Vendor News

New AWS Premium Support Features: Third-Party Software Support and AWS Trusted Advisor

Mon, 01/30/2012 - 12:01

We have added two new benefits to the Gold and Platinum levels of AWS Premium Support. The following features are now in beta testing:

  • We now offer third-party support for popular operating systems running on Amazon EC2. We also support a number of pieces of system software.
  • The AWS Trusted Advisor monitors your use of AWS and recommends configuration changes and new services that may help save you money, improve system performance, and close security gaps.

Third-Party Support
If you have Gold or Platinum Premium Support, you can now ask questions related to a number of popular operating systems including Microsoft Windows, Ubuntu, Red Hat Linux, SuSE Linux, and the Amazon Linux AMI. You can ask us about system software including the Apache and IIS web servers, the Amazon SDKs, Sendmail, Postfix, and FTP. A team of AWS support engineers is ready to help with setup, configuration, and troubleshooting of these important infrastructure components.

We are also enabling the use of desktop sharing software, giving you the option to share your desktop with a support engineer as needed.

AWS Trusted Advisor
AWS Trusted Advisor draws upon best practices learned from AWS’ aggregated operational history of serving hundreds of thousands of AWS customers. The AWS Trusted Advisor inspects your AWS environment and makes recommendations when opportunities exist to save money, improve system performance, or close security gaps. The initial release of the AWS Trusted Advisor includes eight separate checks; we'll be adding more throughout 2012.

The checks are grouped into three families: fault tolerance checks, security audits, and cost optimizations. Here is the initial set of eight checks performed by AWS Trusted Advisor:

  1. Security Group - Open Ports - This check inspects your security groups and classifies each open port into one of three categories: Green for ports used by common protocols such as SSH and HTTP, Red for ports used by protocols that don't usually need to be open on internet-facing servers (e.g. port 1433 for Microsoft SQL Server), and Yellow for all others.
  2. Security Group - CIDR Rules - This check inspects your security groups for rules with errors that might allow more access than intended. Some people (me included) often confuse "/0" and "/32" addresses.
  3. Reserved Instance Recommendations - This check looks at your billing and instance utilization history and recommends optimizations that could be achieved by the purchase of Reserved Instances.
  4. Unused Elastic IP Addresses - Elastic IP Addresses that are not attached to an Amazon EC2 instance will be flagged since you pay for them if you don't use them.
  5. EBS Snapshots - This check looks for EBS volumes that don't have a snapshot, or which have only aged snapshots. The Red/Yellow/Green model is also used here: Red if there is no snapshot at all or if the most recent one is very old; Yellow if the most recent snapshot is somewhat old, and Green if the most recent snapshot is reasonably recent (we're still fine tuning the thresholds for these checks).
  6. Amazon EC2 Availability Zone Balance - This check identifies situations where Amazon EC2 instances are not evenly distributed across Availability Zones, or if (even worse) they are all in the same Availability Zone. The Red/Yellow/Green model is used to characterize the situation.
  7. Elastic Load Balancer Optimization - This check determines whether instance allocation across Availability Zones for each Load Balancer is balanced.
  8. Service Limits - This check gives you visibility into the per-account limits and usage of things like instances, Elastic IP addresses, and other resources (in almost every case, limits can be raised using the appropriate online form).
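The "/0" versus "/32" mix-up from the CIDR Rules check is easy to see with Python's standard ipaddress module. This is a minimal sketch of the idea, not how Trusted Advisor is implemented; the looks_too_open helper and its 256-address threshold are hypothetical.

```python
import ipaddress


def addresses_allowed(cidr: str) -> int:
    """Return how many IPv4 addresses a CIDR rule admits."""
    # strict=False masks off host bits, so "203.0.113.7/0" is read as 0.0.0.0/0.
    return ipaddress.ip_network(cidr, strict=False).num_addresses


def looks_too_open(cidr: str, max_addresses: int = 256) -> bool:
    """Flag rules wider than a hypothetical threshold (default: one /24)."""
    return addresses_allowed(cidr) > max_addresses
```

A "/32" rule admits exactly one address, while the same address written with "/0" admits every IPv4 address, which is why the typo matters.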

AWS Trusted Advisor does not have access to customer data. Recommendations are made by analyzing information gathered using a constrained set of internal and documented AWS API calls.

Here's a diagram to show you how it works:

Advice from the AWS Trusted Advisor is made available in several different forms. For certain issues, we will proactively create support cases and notify you that a given check has identified an opportunity for improvement. The AWS Support Engineers are also available to review AWS Trusted Advisor recommendations any time you call in for support. In the future a regular scorecard report will be available, as will an AWS Trusted Advisor Console with support for viewing, running, customizing, and even opting out of certain checks as desired.

These new features are available for all Gold and Platinum customers. What do you think? Leave a comment and let me know.

-- Jeff;

Categories: Vendor News

New Tagging for Auto Scaling Groups

Thu, 01/26/2012 - 22:29

You can now add up to 10 tags to any of your Auto Scaling Groups. You can also, if you'd like, propagate the tags to the EC2 instances launched from your groups.

Adding tags to your Auto Scaling groups will make it easier for you to identify and distinguish them.

Each tag has a name, a value, and an optional propagation flag. If the flag is set, then the corresponding tag will be applied to EC2 instances launched from the group. You can use this feature to label or distinguish instances created by distinct Auto Scaling groups. You might be using multiple groups to support multiple scalable applications, or multiple scalable tiers or components of a single application. Either way, the tags can help you to keep your instances straight.
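Here is a small Python model of that behavior: a tag carries a name, a value, and a flag, and only flagged tags reach the instances. The class and field names (GroupTag, propagate_at_launch) are hypothetical and are not taken from the Auto Scaling API; treat this as a sketch of the rule rather than of the service.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_TAGS_PER_GROUP = 10  # limit stated in the post


@dataclass(frozen=True)
class GroupTag:
    name: str
    value: str
    propagate_at_launch: bool = False  # hypothetical name for the propagation flag


def instance_tags(group_tags: List[GroupTag]) -> Dict[str, str]:
    """Return the tags an instance launched from the group would receive."""
    if len(group_tags) > MAX_TAGS_PER_GROUP:
        raise ValueError(f"a group may carry at most {MAX_TAGS_PER_GROUP} tags")
    return {t.name: t.value for t in group_tags if t.propagate_at_launch}
```

With a "tier" tag that propagates and an "owner" tag that does not, launched instances would carry only the "tier" tag.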

Read more in the newest version of the Auto Scaling Developer Guide.

-- Jeff;

Categories: Vendor News

AWS HowTo: Using Amazon Elastic MapReduce with DynamoDB (Guest Post)

Wed, 01/25/2012 - 20:42

Today's guest blogger is Adam Gray. Adam is a Product Manager on the Elastic MapReduce Team.

-- Jeff;

Apache Hadoop and NoSQL databases are complementary technologies that together provide a powerful toolbox for managing, analyzing, and monetizing Big Data. That’s why we were so excited to provide out-of-the-box Amazon Elastic MapReduce (Amazon EMR) integration with Amazon DynamoDB, providing customers an integrated solution that eliminates the often prohibitive costs of administration, maintenance, and upfront hardware. Customers can now move vast amounts of data into and out of DynamoDB, as well as perform sophisticated analytics on that data, using EMR’s highly parallelized environment to distribute the work across the number of servers of their choice. Further, as EMR uses a SQL-based engine for Hadoop called Hive, you need only know basic SQL while we handle distributed application complexities such as estimating ideal data splits based on hash keys, pushing appropriate filters down to DynamoDB, and distributing tasks across all the instances in your EMR cluster.

In this article, I’ll demonstrate how EMR can be used to efficiently export DynamoDB tables to S3, import S3 data into DynamoDB, and perform sophisticated queries across tables stored in both DynamoDB and other storage services such as S3.

We will also use sample product order data stored in S3 to demonstrate how you can keep current data in DynamoDB while storing older, less frequently accessed data in S3. By exporting your rarely used data to Amazon S3 you can reduce your storage costs while preserving the low latency access required for high velocity data. Further, exported data in S3 is still directly queryable via EMR (and you can even join your exported tables with current DynamoDB tables).

The sample order data uses the schema below. This includes Order ID as its primary key, a Customer ID field, an Order Date stored as the number of seconds since epoch, and Total representing the total amount spent by the customer on that order. The data also has folder-based partitioning by both year and month, and you’ll see why in a bit.

Creating a DynamoDB Table
Let’s create a DynamoDB table for the month of January, 2012 named Orders-2012-01. We will specify Order ID as the Primary Key. By using a table for each month, it is much easier to export data and delete tables over time when they no longer require low latency access.

For this sample, a read capacity and a write capacity of 100 units should be more than sufficient. When setting these values you should keep in mind that the larger the EMR cluster, the more capacity it will be able to take advantage of. Further, you will be sharing this capacity with any other applications utilizing your DynamoDB table.

Launching an EMR Cluster
Please follow Steps 1-3 in the EMR for DynamoDB section of the Elastic MapReduce Developer Guide to launch an interactive EMR cluster and SSH to its Master Node to begin submitting SQL-based queries. Note that we recommend you use at least three instances of m1.large size for this sample.

At the hadoop command prompt for the current master node, type hive. You should see a hive prompt: hive>

As no other applications will be using our DynamoDB table, let’s tell EMR to use 100% of the available read throughput (by default it will use 50%). Note that this can adversely affect the performance of other applications simultaneously using your DynamoDB table and should be set cautiously.

SET dynamodb.throughput.read.percent=1.0;

Creating Hive Tables
Outside data sources are referenced in your Hive cluster by creating an EXTERNAL TABLE. First let’s create an EXTERNAL TABLE for the exported order data in S3. Note that this simply creates a reference to the data; no data is moved yet.

CREATE EXTERNAL TABLE orders_s3_export ( order_id string, customer_id string, order_date int, total double )
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://elastic-mapreduce/samples/ddb-orders' ;

You can see that we specified the data location, the ordered data fields, and the folder-based partitioning scheme.
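To picture what a row of this tab-delimited data looks like once it is split into the four declared fields, here is a rough Python sketch. It only mirrors the field order and types from the table definition; the sample values in the usage note are invented, and Hive does this parsing for you.

```python
from typing import NamedTuple


class Order(NamedTuple):
    order_id: str
    customer_id: str
    order_date: int  # seconds since epoch
    total: float


def parse_order_line(line: str) -> Order:
    """Split one tab-delimited line into the four fields, in declared order."""
    order_id, customer_id, order_date, total = line.rstrip("\n").split("\t")
    return Order(order_id, customer_id, int(order_date), float(total))
```

For example, a made-up line such as "o-1", "c-2cC5fF1bB", "1325376000", and "19.99" separated by tabs would yield an Order whose order_date is an integer and whose total is a float.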

Now let’s create an EXTERNAL TABLE for our DynamoDB table.

CREATE EXTERNAL TABLE orders_ddb_2012_01 ( order_id string, customer_id string, order_date bigint, total double )
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' TBLPROPERTIES (
"dynamodb.table.name" = "Orders-2012-01",
"dynamodb.column.mapping" = "order_id:Order ID,customer_id:Customer ID,order_date:Order Date,total:Total"
);

This is a bit more complex. We need to specify the DynamoDB table name, the DynamoDB storage handler, the ordered fields, and a mapping between the EXTERNAL TABLE fields (which can’t include spaces) and the actual DynamoDB fields.
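The column mapping string is just a comma-separated list of hive_column:DynamoDB attribute pairs. The following Python sketch splits the same string into a dictionary so the pairing is easier to see. It mimics the string format only and makes no claim about how the storage handler parses it internally.

```python
from typing import Dict


def parse_column_mapping(mapping: str) -> Dict[str, str]:
    """Turn 'hive_col:DynamoDB Attribute,...' into a dict keyed by Hive column."""
    pairs = (item.split(":", 1) for item in mapping.split(","))
    return {hive_col.strip(): attribute.strip() for hive_col, attribute in pairs}


mapping = parse_column_mapping(
    "order_id:Order ID,customer_id:Customer ID,order_date:Order Date,total:Total"
)
```

Note that the keys (the Hive column names) contain no spaces, while the values (the DynamoDB attribute names) may.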

Now we’re ready to start moving some data!

Importing Data into DynamoDB
In order to access the data in our S3 EXTERNAL TABLE, we first need to specify which partitions we want in our working set via the ADD PARTITION command. Let’s start with the data for January 2012.

ALTER TABLE orders_s3_export ADD PARTITION (year='2012', month='01') ;

Now if we query our S3 EXTERNAL TABLE, only this partition will be included in the results. Let’s load all of the January 2012 order data into our external DynamoDB Table. Note that this may take several minutes.

INSERT OVERWRITE TABLE orders_ddb_2012_01
SELECT order_id, customer_id, order_date, total
FROM orders_s3_export ;

Looks a lot like standard SQL, doesn’t it?

Querying Data in DynamoDB Using SQL
Now let’s find the top 5 customers by spend over the first week of January. Note the use of unix_timestamp, as order_date is stored as the number of seconds since epoch.

SELECT customer_id, sum(total) spend, count(*) order_count
FROM orders_ddb_2012_01
WHERE order_date >= unix_timestamp('2012-01-01', 'yyyy-MM-dd')
AND order_date < unix_timestamp('2012-01-08', 'yyyy-MM-dd')
GROUP BY customer_id
ORDER BY spend desc
LIMIT 5 ;
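If you want to sanity-check the two date bounds, this Python sketch computes the same epoch values. It assumes the dates are interpreted as midnight UTC; Hive's unix_timestamp uses the cluster's time zone, so the numbers would shift on a cluster configured differently.

```python
from datetime import datetime, timezone


def epoch_seconds(date_text: str) -> int:
    """Seconds since the Unix epoch for midnight UTC on a yyyy-MM-dd date."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


week_start = epoch_seconds("2012-01-01")
week_end = epoch_seconds("2012-01-08")  # exclusive upper bound, as in the query
```

The half-open range (greater than or equal to the start, strictly less than the end) covers exactly seven days without counting midnight of January 8 twice.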

Querying Exported Data in S3
It looks like customer ‘c-2cC5fF1bB’ was the biggest spender for that week. Now let’s query our historical data in S3 to see what that customer spent in each of the final 6 months of 2011. First, though, we need to add the additional data to our working set. The RECOVER PARTITIONS command makes it easy to add every partition found under the table's location in one step.

ALTER TABLE orders_s3_export RECOVER PARTITIONS;

We will now query the 2011 exported data for customer ‘c-2cC5fF1bB’ from S3. Note that the partition fields, both month and year, can be used in your Hive query.

SELECT year, month, customer_id, sum(total) spend, count(*) order_count
FROM orders_s3_export
WHERE customer_id = 'c-2cC5fF1bB'
AND month >= 7
AND year = 2011
GROUP BY customer_id, year, month
ORDER BY month desc;

Exporting Data to S3
Now let’s export the January 2012 DynamoDB table data to a different S3 bucket owned by you (denoted by YOUR BUCKET in the command). We’ll first need to create an EXTERNAL TABLE for that S3 bucket. Note that we again partition the data by year and month.

CREATE EXTERNAL TABLE orders_s3_new_export ( order_id string, customer_id string, order_date int, total double )
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://YOUR BUCKET';

Now export the data from DynamoDB to S3, specifying the appropriate partition values for that table’s month and year.

INSERT OVERWRITE TABLE orders_s3_new_export
PARTITION (year='2012', month='01')
SELECT * from orders_ddb_2012_01;
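Hive lays out partitioned data in key=value folders under the table location, so the export above lands under a year=2012/month=01 prefix. Here is a tiny Python sketch of that path construction; the bucket name in the usage note is a placeholder, just like YOUR BUCKET in the command.

```python
def partition_prefix(table_location: str, year: str, month: str) -> str:
    """Build the key=value folder prefix Hive uses for a (year, month) partition."""
    return f"{table_location.rstrip('/')}/year={year}/month={month}/"
```

For a table located at s3://example-bucket, the January 2012 partition would be written under s3://example-bucket/year=2012/month=01/.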

Note that if this was the end of a month and you no longer needed low latency access to that table’s data, you could also delete the table in DynamoDB. You may also now want to terminate your job flow from the EMR console to ensure you do not continue being charged.

That’s it for now. Please visit our documentation for more examples, including how to specify the format and compression scheme for your exported files.

-- Adam Gray, Product Manager, Amazon Elastic MapReduce.

Categories: Vendor News

The AWS Storage Gateway - Integrate Your Existing On-Premises Applications with AWS Cloud Storage

Wed, 01/25/2012 - 03:56

Warning: If you don't have a data center, or if all of your IT infrastructure is already in the cloud, you may not need to read this post! But feel free to pass it along to your friends and colleagues.

The Storage Gateway
Our new AWS Storage Gateway service connects an on-premises software appliance with cloud-based storage to integrate your existing on-premises applications with the AWS storage infrastructure in a seamless, secure, and transparent fashion. Watch this video for an introduction:

Data stored in your current data center can be backed up to Amazon S3, where it is stored as Amazon EBS snapshots. Once there, you will benefit from S3's low cost and intrinsic redundancy. In the event you need to retrieve a backup of your data, you can easily restore these snapshots locally to your on-premises hardware. You can also access them as Amazon EBS volumes, enabling you to easily mirror data between your on-premises and Amazon EC2-based applications.

You can install the AWS Storage Gateway's software appliance on a host machine in your data center. Here's how all of the pieces fit together:

 

The AWS Storage Gateway allows you to create storage volumes and attach these volumes as iSCSI devices to your on-premises application servers. The volumes can be Gateway-Stored (right now) or Gateway-Cached (soon) volumes. Gateway-Stored volumes retain a complete copy of the volume on the local storage attached to the on-premises host, while uploading backup snapshots to Amazon S3. This provides low-latency access to your entire data set while providing durable off-site backups. Gateway-Cached volumes will use the local storage as a cache for frequently-accessed data; the definitive copy of the data will live in the cloud. This will allow you to offload your storage to Amazon S3 while preserving low-latency access to your active data.

Gateways can connect to AWS directly or through a local proxy. You can connect through AWS Direct Connect if you would like, and you can also control the amount of inbound and outbound bandwidth consumed by each gateway. All data is compressed prior to upload.

Each gateway can support up to 12 volumes and a total of 12 TB of storage. You can have multiple gateways per account and you can choose to store data in our US East (Northern Virginia), US West (Northern California), US West (Oregon), EU (Ireland), Asia Pacific (Singapore), or Asia Pacific (Tokyo) Regions.

The first release of the AWS Storage Gateway takes the form of a VM image for VMware ESXi 4.1 (we plan on supporting other virtual environments in the future). Adequate local disk storage, either Direct Attached or SAN (Storage Area Network), is needed for your application storage (used by your iSCSI storage volumes) and working storage (data queued up for writing to AWS). We currently support mounting of our iSCSI storage volumes using the Microsoft Windows and Red Hat iSCSI Initiators.

Up and Running
During the installation and configuration process you will be able to create up to 12 iSCSI storage volumes per gateway. Once installed, each gateway will automatically download, install, and deploy updates and patches. This activity takes place during a maintenance window that you can set on a per-gateway basis.

The AWS Management Console includes complete support for the AWS Storage Gateway. You can create volumes, create and restore snapshots, and establish a schedule for snapshots. Snapshots can be scheduled at 1, 2, 4, 8, 12, or 24 hour intervals. Each gateway reports a number of metrics to Amazon CloudWatch for monitoring.

The snapshots are stored as Amazon EBS (Elastic Block Store) snapshots. You can create an EBS volume using a snapshot of one of your local gateway volumes, or vice versa. Does this give you any interesting ideas?

The Gateway in Action
I expect the AWS Storage Gateway will be put to use in all sorts of ways. Some that come to mind are:

  • Disaster Recovery and Business Continuity - You can reduce your investment in hardware set aside for Disaster Recovery using a cloud-based approach. You can send snapshots of your precious data to the cloud on a regular and frequent basis and you can use our VM Import service to move your virtual machine images to the cloud.
  • Backup - You can back up local data to the cloud without worrying about running out of storage space. It is easy to schedule the backups, and you don't have to arrange to ship tapes off-site or manage your own infrastructure in a second data center.
  • Data Migration - You can now move data from your data center to the cloud, and back, with ease.

Security Considerations
We believe that the AWS Storage Gateway will be at home in the enterprise, so I'll cover the inevitable security questions up front. Here are the facts:

  • Data traveling between AWS and each gateway is protected via SSL.
  • Data at rest (stored in Amazon S3) is encrypted using AES-256.
  • The iSCSI initiator authenticates itself to the target using CHAP (Challenge-Handshake Authentication Protocol).
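For readers who haven't met CHAP before, here is a minimal Python sketch of the exchange as defined in RFC 1994 with MD5: the target sends a random challenge, and the initiator answers with a digest over the identifier, the shared secret, and the challenge, so the secret itself never crosses the wire. The secret value is a placeholder, and this is a sketch of the protocol idea rather than of the gateway's implementation.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP with MD5 (RFC 1994): MD5 over identifier, shared secret, challenge."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def target_verifies(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """The target recomputes the digest and compares in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


secret = b"example-shared-secret"  # hypothetical; never hard-code real secrets
challenge = os.urandom(16)         # random challenge issued by the target
response = chap_response(1, secret, challenge)
```

An initiator holding the wrong secret produces a different digest, so verification fails without the target ever revealing what the right secret is.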

Costs
All AWS users are eligible for a free trial of the AWS Storage Gateway. After that, there is a charge of $125 per month for each activated gateway. The usual EBS snapshot storage rates apply ($0.14 per Gigabyte-month in the US-East Region), as do the usual AWS prices for outbound data transfer (there's no charge for inbound data transfer). More pricing information can be found on the Storage Gateway Home Page. If you are eligible for the AWS Free Usage Tier, you get up to 1 GB of free EBS snapshot storage per month as well as 15 GB of outbound data transfer.
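As a quick worked example of these prices, the Python sketch below estimates a monthly bill from the gateway fee and the US-East snapshot rate quoted above. It deliberately leaves out outbound data transfer and the free trial, and the function and parameter names are my own, so treat the result as a rough estimate.

```python
GATEWAY_FEE_USD = 125.00          # per activated gateway per month (from the post)
SNAPSHOT_USD_PER_GB_MONTH = 0.14  # US-East EBS snapshot rate quoted in the post


def monthly_cost(gateways: int, snapshot_gb: float, free_snapshot_gb: float = 0.0) -> float:
    """Estimate gateway fees plus snapshot storage; data transfer is excluded."""
    billable_gb = max(snapshot_gb - free_snapshot_gb, 0.0)
    return round(gateways * GATEWAY_FEE_USD + billable_gb * SNAPSHOT_USD_PER_GB_MONTH, 2)
```

One gateway with 500 GB of snapshots works out to $125 plus $70, or $195 per month before data transfer.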

On the Horizon
As I mentioned earlier, the first release of the AWS Storage Gateway supports Gateway-Stored volumes. We plan to add support for Gateway-Cached volumes in the coming months.

We'll add more features to our roadmap as soon as our users (this means you) start to use the AWS Storage Gateway and send feedback our way.

Learn More
You can visit the Storage Gateway Home Page or read the Storage Gateway User Guide to learn more.

We will be hosting a Storage Gateway webinar on Thursday, February 23rd. Please attend if you would like to learn more about the Storage Gateway and how it can be used for backup, disaster recovery, and data mirroring scenarios. The webinar is free and open to all, but space is limited and you need to register!

-- Jeff;

Categories: Vendor News

Launch Relational Database Service Instances in the Virtual Private Cloud

Tue, 01/24/2012 - 20:34

You can now launch Amazon Relational Database Service (RDS) DB instances inside of a Virtual Private Cloud (VPC).

Some Background
The Relational Database Service takes care of all of the messiness associated with running a relational database. You don't have to worry about finding and configuring hardware, installing an operating system or a database engine, setting up backups, arranging for fault detection and failover, or scaling compute or storage as your needs change.

The Virtual Private Cloud lets you create a private, isolated section of the AWS Cloud. You have complete control over IP address ranges, subnetting, routing tables, and network gateways to your own data center and to the Internet.

Here We Go
Before you launch an RDS DB Instance inside of a VPC, you must first create the VPC and partition its IP address range into the desired subnets. You can do this using the VPC wizard pictured above, the VPC command line tools, or the VPC APIs.
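If you'd like to plan the subnet layout before touching the VPC tools, Python's standard ipaddress module can do the arithmetic. The 10.0.0.0/16 range and the choice of /24 subnets below are assumptions made for illustration; substitute whatever fits your own addressing plan.

```python
import ipaddress

# Hypothetical VPC range; pick one that fits your own addressing plan.
vpc_range = ipaddress.ip_network("10.0.0.0/16")

# Carve the /16 into /24 subnets and take the first three as an example.
subnets = list(vpc_range.subnets(new_prefix=24))[:3]
for subnet in subnets:
    print(subnet, "-", subnet.num_addresses, "addresses")
```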

Then you need to create a DB Subnet Group. The Subnet Group should have at least one subnet in each Availability Zone of the target Region; it identifies the subnets (and the corresponding IP address ranges) where you would like to be able to run DB Instances within the VPC. This will allow a Multi-AZ deployment of RDS to create a new standby in another Availability Zone should the need arise. You need to do this even for Single-AZ deployments, just in case you want to convert them to Multi-AZ at some point.
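The "at least one subnet in each Availability Zone" rule is simple to check ahead of time. This Python sketch takes a mapping of subnet to zone and reports any zone left uncovered; the subnet identifiers and zone names in the usage note are placeholders, and the helper is not part of any AWS tool.

```python
from typing import Dict, Iterable, Set


def uncovered_zones(subnet_zones: Dict[str, str], region_zones: Iterable[str]) -> Set[str]:
    """Return Availability Zones that have no subnet in the DB Subnet Group."""
    covered = set(subnet_zones.values())
    return set(region_zones) - covered
```

For instance, a group with subnets in zones "a" and "b" of a three-zone Region would report zone "c" as uncovered, meaning a Multi-AZ standby could not be placed there.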

You can create a DB Security Group, or you can use the default. The DB Security Group gives you control over access to your DB Instances; you can allow access from EC2 instances with specific EC2 Security Group or VPC Security Group membership, or from designated ranges of IP addresses. You can also use VPC subnets and the associated network Access Control Lists (ACLs) if you'd like. You have a lot of control and a lot of flexibility.

The next step is to launch a DB Instance within the VPC while referencing the DB Subnet Group and a DB Security Group. With this release, you are able to use the MySQL DB engine (we plan to add additional options over time). The DB Instance will have an Elastic Network Interface using an IP address selected from your DB Subnet Group. You can use the IP address to reach the instance if you'd like, but we recommend that you use the instance's DNS name instead since the IP address can change during failover of a Multi-AZ deployment.

Upgrading to VPC
If you are running an RDS DB Instance outside of a VPC, you can snapshot the DB Instance and then restore the snapshot into the DB Subnet Group of your choice. You cannot, however, access or use snapshots taken from within a VPC outside of the VPC. This is a restriction that we have put into place for security reasons.

Use Cases and Access Options
You can put this new combination (RDS + VPC) to use in a variety of ways. Here are some suggestions:

  • Private DB Instances Within a VPC - This is the most obvious and straightforward use case, and is a perfect way to run corporate applications that are not intended to be accessed from the Internet.
  • Public-facing Web Application with Private Database - Host the web site on a public-facing subnet and the DB Instances on a private subnet that has no Internet access. The application server and the RDS DB Instances will not have public IP addresses.

Your Turn
You can launch RDS instances in your VPCs today in all of the AWS Regions except AWS GovCloud (US). What are you waiting for?

-- Jeff;

 

Categories: Vendor News

AWS Toolkits for Eclipse and Visual Studio Now Support DynamoDB

Fri, 01/20/2012 - 15:12

The AWS Toolkit for Eclipse and the AWS Toolkit for Visual Studio now support Amazon DynamoDB. You can create tables, insert and edit data, initiate table scans, and more.

Here are some screen shots from the AWS Toolkit for Visual Studio.

Create a table:

Edit a multi-valued attribute:

Set up a table scan:

The AWS Toolkit for Visual Studio also contains the latest and greatest version of the AWS SDK for .NET. This version of the SDK includes support for Amazon DynamoDB, in the form of the Amazon.DynamoDB.DocumentModel and Amazon.DynamoDB.TableModel classes and namespaces. More information about the updates to the SDK can be found in the release notes.

Similarly, the AWS Toolkit for Eclipse contains the latest and greatest version of the AWS SDK for Java. This SDK also includes support for Amazon DynamoDB. Per the release notes, you can use the AmazonDynamoDBClient object to send requests directly to Amazon DynamoDB, or you can use the high-level API in the AWS SDK for Java to annotate your Java objects and automatically map them into Amazon DynamoDB.

-- Jeff;

Categories: Vendor News

Identity Federation to the AWS Management Console

Thu, 01/19/2012 - 22:47

In August, we announced that AWS Identity and Access Management (IAM) added support for Identity Federation. This enabled customers to use their existing identities (e.g. users) to securely access AWS APIs and resources using IAM's fine-grained access controls, without the need to create an IAM user for each identity.

Today we are announcing that we have extended IAM’s Identity Federation functionality to also enable federated users to access the AWS Management Console. This allows you to enable your employees to sign in once to your corporate directory, and then use the AWS Management Console without having to sign in to AWS, providing single sign-on access to AWS.

In my previous post on the topic of Identity Federation, I discussed how you could set up an identity broker, which calls our Security Token Service (STS), requesting temporary security credentials to provide your users access to AWS. You explicitly specify the permissions that these temporary credentials give your users, as well as control the amount of time (1 to 36 hours) these credentials are valid for. Well, these same temporary security credentials can now also be used to access the AWS Management Console.

Here's the basic flow:

  1. User signs in to the enterprise network with their enterprise credentials.
  2. User browses to an internal site and clicks on Sign in to AWS Management Console.
  3. Page calls identity broker.
  4. Identity broker validates access rights and provides temporary security credentials, which include the user's permissions to access AWS.
  5. The page includes these temporary security credentials as part of the sign-in request to AWS.
  6. User is logged in to the AWS Management Console with the appropriate IAM policy.

If you have already built an identity broker, perhaps using our sample application, to enable Identity Federation to AWS service APIs for users in your enterprise directory, you’re already most of the way there. All you need to do is implement an internal web page with redirect links to the AWS Management Console, and include the temporary security credentials as part of the sign-in request. Below is a simple Ruby code sample that shows how to do just that (just replace the highlighted items with your own identifiers and URLs):

require 'rubygems'
require 'json'
require 'open-uri'
require 'cgi'
require 'aws-sdk'

# The temporary credentials will normally come from your identity
# broker, but for simplicity we create them in place
sts = AWS::STS.new(:access_key_id => "*** Your AWS Access Key ID ***",
  :secret_access_key => "*** Your AWS Secret Access Key ***")

# A sample policy for accessing SNS in the console.
policy = AWS::STS::Policy.new
policy.allow(:actions => "sns:*",:resources => :any)

session = sts.new_federated_session(
  "UserName",
  :policy => policy,
  :duration => 3600)

# The issuer parameter specifies your internal sign-in
# page, for example https://mysignin.internal.mycompany.com/.
# The console parameter specifies the URL to the destination tab of the
# AWS Management Console. This example goes to the sns console.
# The signin parameter is the URL to send the request to.
issuer_url = "https://mysignin.internal.mycompany.com/"
console_url = "https://console.aws.amazon.com/sns"
signin_url = "https://signin.aws.amazon.com/federation"

# Create the signin token using temporary credentials,
# including the Access Key ID, Secret Access Key, and security token.

session_json = {
  :sessionId => session.credentials[:access_key_id],
  :sessionKey => session.credentials[:secret_access_key],
  :sessionToken => session.credentials[:session_token]
}.to_json

get_signin_token_url = signin_url + "?Action=getSigninToken&SessionType=json&Session=" + CGI.escape(session_json)
returned_content = URI.parse(get_signin_token_url).read
signin_token = JSON.parse(returned_content)['SigninToken']
signin_token_param = "&SigninToken=" + CGI.escape(signin_token)

# The issuer parameter is optional, but recommended. Use it to direct users
# to your sign-in page when their session expires.
issuer_param = "&Issuer=" + CGI.escape(issuer_url)
destination_param = "&Destination=" + CGI.escape(console_url)

login_url = signin_url + "?Action=login" + signin_token_param + issuer_param + destination_param

You can control the user name displayed in the upper right corner of the AWS Management Console when your user logs in. You can also optionally provide an "Issuer" URL when signing your users in. This URL will then be displayed to the user when their credentials expire, so they can re-authenticate with your identity system before continuing to use the AWS Console.

The following services support Identity Federation to the AWS Management Console today: Amazon EC2, Amazon S3, Amazon SNS, Amazon SQS, Amazon VPC, Amazon CloudFront, Amazon Route 53, Amazon CloudWatch, Amazon RDS, Amazon ElastiCache, Amazon SES, Elastic Load Balancing, and IAM. We'll of course be adding support for additional service consoles over time (the busy Amazon DynamoDB team is already working on it!).

-- Jeff;

Categories: Vendor News

Guest Post: Geo-Blocking Content With Amazon CloudFront

Thu, 01/19/2012 - 20:15

Today's guest blogger is Nihar Bihani, a Product Manager on the Amazon CloudFront team.

-- Jeff;

After we launched Amazon CloudFront in November 2008, customers began asking for a way to block access to their content being delivered. We heard a variety of reasons why customers wanted to have detailed control over who is able to download their files from Amazon CloudFront. Some of the more common use cases we heard included customers wanting the ability to block content delivered by Amazon CloudFront so they could sell digital goods only to paying customers on their website, deliver training materials only to their employees and offer secure video streaming for their pay-per-view or subscription access model. We listened to their feedback and we launched Amazon CloudFront’s private content feature in late 2009 for download content and in early 2010 for streaming content. These features help customers protect their content by restricting access based on date ranges, IP addresses, and IP address ranges.
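To make the date-range and IP-address restrictions a little more concrete, here is a rough sketch in Ruby of the kind of policy document that a private content URL can carry. The statement layout reflects my understanding of CloudFront's custom policy format; the distribution domain, dates, address range, and the helper name custom_policy are all hypothetical. Signing the policy with your CloudFront key pair is a separate step that is not shown here.

```ruby
require 'json'

# Builds a custom policy document as a JSON string. All arguments used
# below are placeholders chosen for illustration.
def custom_policy(resource_url, not_before, not_after, source_ip_range)
  {
    "Statement" => [{
      "Resource" => resource_url,
      "Condition" => {
        "DateGreaterThan" => { "AWS:EpochTime" => not_before.to_i },
        "DateLessThan"    => { "AWS:EpochTime" => not_after.to_i },
        "IpAddress"       => { "AWS:SourceIp" => source_ip_range }
      }
    }]
  }.to_json
end

start  = Time.utc(2012, 2, 1)
finish = Time.utc(2012, 3, 1)
policy = custom_policy("http://d111111abcdef8.cloudfront.net/training/*",
                       start, finish, "192.0.2.0/24")
puts policy
```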

More recently, we heard Amazon CloudFront customers ask for another method of blocking access to their content based on the geographic location of their viewers. One use case is a video publisher who may only have rights to distribute video to users in a single country and needs a way to prevent users who aren’t in that country from accessing their video. Another is a software delivery company that needs to limit the downloading of their content to certain territories because of licensing terms that prevent users in certain countries from downloading their software. We’ll refer to blocking access to certain countries or territories as geo-restriction.

As a result of this customer feedback, we recently published a tutorial that shows how to add geo-restriction logic to your web application using Amazon CloudFront’s private content feature in combination with a third-party geo-location product. The geo-location product translates your end user's client IP address into an estimation of the end user’s location. The tutorial shows you how to consume this location data and issue an Amazon CloudFront private content URL based on the results. We’ve included sample code in Java, .NET, and PHP that works with two different geo-location products.

Here's how it works:

  1. End user requests a webpage on your site.
  2. Your web server sends the end user’s IP address to a geo-location service.
  3. Geo-location service returns the geographic location for the end user.
  4. Your web server determines if the end user should have access to your content on Amazon CloudFront. If so, your web server generates an Amazon CloudFront signed URL.
  5. End user browser requests the content from Amazon CloudFront using the signed URL.

Using Amazon CloudFront and a third-party geo-location service to restrict access to your content from your application also provides you with control over your end user's experience if they are restricted from access. For end users whose access is blocked, your application can display a meaningful message instead of returning an error code. You can also customize the error message you display for your end users according to their location.
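The decision made in steps 2 through 4 can be sketched in a few lines of Ruby. This is not the tutorial's code: lookup_country and signed_url_for are hypothetical stand-ins for the geo-location service and for CloudFront URL signing, and the IP addresses and country codes are made up.

```ruby
# Hypothetical stand-in for a third-party geo-location lookup; a real
# integration would call the vendor's API with the client IP address.
GEO_TABLE = { "203.0.113.7" => "JP", "198.51.100.9" => "US" }

def lookup_country(ip_address)
  GEO_TABLE[ip_address]   # nil when the location is unknown
end

# Hypothetical stand-in for generating a CloudFront signed URL.
def signed_url_for(path)
  "https://d111111abcdef8.cloudfront.net#{path}?Signature=PLACEHOLDER"
end

# Look up the location, then either issue a signed URL or return a
# friendly, location-specific message instead of an error code.
def respond_to_request(ip_address, path, allowed_countries)
  country = lookup_country(ip_address)
  if country && allowed_countries.include?(country)
    { :status => :allowed, :url => signed_url_for(path) }
  else
    place = country || "your location"
    { :status => :blocked,
      :message => "Sorry, this video is not available in #{place}." }
  end
end

puts respond_to_request("198.51.100.9", "/video.mp4", ["US"]).inspect
puts respond_to_request("203.0.113.7", "/video.mp4", ["US"]).inspect
```

Treating an unknown location as blocked is a design choice made for this sketch; your licensing terms may call for the opposite default.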

You can find the tutorial here. Please take a look and let us know what you think.

Nihar Bihani
Product Manager - Amazon CloudFront

Categories: Vendor News

Amazon DynamoDB - Internet-Scale Data Storage the NoSQL Way

Wed, 01/18/2012 - 12:32

We want to make it very easy for you to be able to store any amount of semistructured data and to be able to read, write, and modify it quickly, efficiently, and with predictable performance. We don't want you to have to worry about servers, disks, replication, failover, monitoring, software installation, configuration, or updating, hardware upgrades, network bandwidth, free space, sharding, rearchitecting, or a host of other things that will jump up and bite you at the worst possible time.

We want you to think big, to dream big dreams, and to envision (and then build) data-intensive applications that can scale from zero users up to tens or hundreds of millions of users before you know it. We want you to succeed, and we don't want your database to get in the way. Focus on your app and on building a user base, and leave the driving to us.

Sound good?

Hello, DynamoDB
Today we are introducing Amazon DynamoDB, our Internet-scale NoSQL database service. Built from the ground up to be efficient, scalable, and highly reliable, DynamoDB will let you store as much data as you want and access it as often as you'd like, with predictable performance brought on by the use of Solid State Disk, better known as SSD.

DynamoDB works on the basis of provisioned throughput. When you create a DynamoDB table, you simply tell us how much read and write throughput you need. Behind the scenes we'll set things up so that we can meet your needs, while maintaining latency that's in the single-digit milliseconds. Later, if your needs change, you can simply turn the provisioned throughput dial up (or down) and we'll adjust accordingly. You can do this online, with no downtime and with no impact on the overall throughput. In other words, you can scale up even when your database is handling requests.

We've made DynamoDB ridiculously easy to use. Newly created tables will usually be ready to use within a minute or two. Once the table is ready, you simply start storing data (as much as you want) into it, paying only for the storage that you use (there's no need to pre-provision storage). Again, behind the scenes, we'll take care of provisioning adequate storage for you.

Each table must have a primary index. In this release, you can choose between two types of primary keys: Simple Hash Keys and Composite Hash Keys with Range Keys.

  • Simple Hash Keys give DynamoDB the Distributed Hash Table abstraction and are used to index on a unique key. The key is hashed over multiple processing and storage partitions to optimally distribute the workload.
  • Composite Hash Keys with Range Keys give you the ability to create a primary key that is composed of two attributes -- a hash attribute and a range attribute. When you query against this type of key, the hash attribute must be uniquely matched but a range (low to high) can be specified for the range attribute. You can use this to run queries such as "all orders from Jeff in the last 24 hours."
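Here is a toy in-memory model, in Ruby, of how a composite key answers the "all orders from Jeff in the last 24 hours" query: an exact match on the hash attribute, then a low-to-high filter on the range attribute. The ToyTable class, the customer names, and the timestamps are all hypothetical; this shows the idea and makes no claim about how DynamoDB is implemented.

```ruby
# A toy table keyed by [hash attribute, range attribute].
class ToyTable
  def initialize
    @partitions = {}
  end

  def put(hash_key, range_key, item)
    rows = (@partitions[hash_key] ||= [])
    rows.reject! { |r| r[:range] == range_key }   # primary keys are unique
    rows << { :range => range_key, :item => item }
  end

  # Exact match on the hash attribute, low..high on the range attribute.
  def query(hash_key, low, high)
    rows = @partitions.fetch(hash_key, [])
    matches = rows.select { |r| r[:range] >= low && r[:range] <= high }
    matches.sort_by { |r| r[:range] }.map { |r| r[:item] }
  end
end

now = Time.utc(2012, 1, 18, 12, 0, 0).to_i
orders = ToyTable.new
orders.put("Jeff", now - 3_600,      "order-1")   # one hour ago
orders.put("Jeff", now - 2 * 86_400, "order-0")   # two days ago
orders.put("Adam", now - 60,         "order-2")   # a different customer

puts orders.query("Jeff", now - 86_400, now).inspect
```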

Each item in a DynamoDB table consists of a set of key/value pairs. Each value can be a string, a number, a string set, or a number set. When you choose to retrieve (get) an item, you can choose between a strongly consistent read and an eventually consistent read based on your needs. The eventually consistent reads consume half as many resources, so there's a throughput consideration to think about.

Sounds great, you say, but what about reliability and data durability? Don't worry, we've got that covered too! When you create a DynamoDB table in a particular region, we'll synchronously replicate your data across servers in multiple zones. You'll never know about (or be affected by) hardware or facility failures. If something breaks, we'll get the data from another server.

I can't stress the operational performance of DynamoDB enough. You can start small (say 5 reads per second) and scale up to 50, 500, 5000, or even 50,000 reads per second. Again, online, and with no changes to your code. And (of course) you can do the same for writes. DynamoDB will grow with you, and it is not going to get between you and success.

As part of the AWS Free Usage Tier, you get 100 MB of free storage, 5 writes per second, and 10 strongly consistent reads per second (or 20 eventually consistent reads per second). Beyond that, pricing is based on how much throughput you provision and how much data you store. As is always the case with AWS, there's no charge for bandwidth between an EC2 instance and a DynamoDB table in the same Region.
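The relationship between the two kinds of reads is simple arithmetic. The sketch below, in Ruby, applies the "half as many resources" rule to a mixed workload; read_throughput_needed is a hypothetical helper, and it deliberately ignores any other factor (such as item size) that might affect real consumption.

```ruby
# Read throughput consumed by a workload, where an eventually consistent
# read costs half of a strongly consistent one.
def read_throughput_needed(strong_reads_per_sec, eventual_reads_per_sec)
  strong_reads_per_sec + (eventual_reads_per_sec / 2.0)
end

free_tier_reads = 10   # strongly consistent reads per second, from the post

puts read_throughput_needed(10, 0)
puts read_throughput_needed(0, 20)
puts read_throughput_needed(4, 12) <= free_tier_reads
```

The first two calls give the same figure, which is why the free tier can be quoted as either 10 strongly consistent or 20 eventually consistent reads per second.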

You can create up to 256 tables, each provisioned for 10,000 reads and 10,000 writes per second. I cannot emphasize the next point strongly enough: We are ready, willing, and able to increase any of these values; simply click here and provide us with some additional information. Our early customers have, in several cases, already exceeded the default limits by an order of magnitude!

DynamoDB from the AWS Management Console
The AWS Management Console has a new DynamoDB tab. You can create a new table, provision the throughput, set up the index, and configure CloudWatch alarms with a few clicks:

You can enter your throughput requirements manually:

Or you can use the calculator embedded in the dialog:

You can easily set CloudWatch alarms that will fire when you are consuming more than a specified percentage of the throughput that you have provisioned for the table:

You can use the CloudWatch metrics to see when it is time to add additional read or write throughput:

You can easily increase or decrease the provisioned throughput:

Programming With DynamoDB
The AWS SDKs have been updated and now include complete support for DynamoDB. Here are some examples that I put together using the AWS SDK for PHP.

The first step is to include the SDK and create a reference object:

require_once("sdk.class.php");
$DDB = new AmazonDynamoDB(array('credentials' => 'production'));

Creating a table requires three arguments: a table name, a key specification, and a throughput specification:

// Create a table
$Schema = array('HashKeyElement' =>
                array('AttributeName' => 'RecordId',
                      'AttributeType' => AmazonDynamoDB::TYPE_STRING));

$Throughput = array('ReadsPerSecond' => 5, 'WritesPerSecond' => 5);

$Res = $DDB->create_table(array('TableName' => 'Sample',
                                'KeySchema' => $Schema,
                                'ProvisionedThroughput' => $Throughput));

After create_table returns, the table's status will be CREATING. It will transition to ACTIVE when the table is provisioned and ready to accept data. You can use the describe_table function to get the status and other information about the table:

$Res = $DDB->describe_table(array('TableName' => 'Sample'));
print_r($Res->body->Table);

Here's the result as a PHP object:

CFSimpleXML Object
(
    [CreationDateTime] => 1324673829.32
    [ItemCount] => 0
    [KeySchema] => CFSimpleXML Object
        (
            [HashKeyElement] => CFSimpleXML Object
                (
                    [AttributeName] => RecordId
                    [AttributeType] => S
                )

        )

    [ProvisionedThroughput] => CFSimpleXML Object
        (
            [ReadsPerSecond] => 5
            [WritesPerSecond] => 5
        )

    [TableName] => Sample
    [TableSizeBytes] => 0
    [TableStatus] => ACTIVE
)

It is really easy to insert new items. You need to specify the data type of each attribute value; here's how you do that (the other data type constants are TYPE_ARRAY_OF_STRINGS and TYPE_ARRAY_OF_NUMBERS):

for ($i = 1; $i < 100; $i++)
{
  print($i);
  $Item = array('RecordId' => array(AmazonDynamoDB::TYPE_STRING => (string) $i),
                'Square'   => array(AmazonDynamoDB::TYPE_NUMBER => (string) ($i * $i)));

  $Res = $DDB->put_item(array('TableName' => 'Sample', 'Item' => $Item));
}

Retrieval by the RecordId key is equally easy:

for ($i = 1; $i < 100; $i++)
{
  $Key = array('HashKeyElement' => array(AmazonDynamoDB::TYPE_STRING => (string) $i));

  $Item = $DDB->get_item(array('TableName' => 'Sample',
                               'Key'       => $Key));

  print_r($Item->body->Item);
}

Each returned item looks like this as a PHP object:

CFSimpleXML Object
(
    [RecordId] => CFSimpleXML Object
        (
            [S] => 44
        )

    [Square] => CFSimpleXML Object
        (
            [N] => 1936
        )

)
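The S and N labels in the output above are the type codes attached to each value. As an illustration of the typing rule (in Ruby rather than PHP, and independent of any SDK), here is a hypothetical helper named typed_attribute that maps plain values to type-tagged pairs. I am assuming SS and NS as the codes for string sets and number sets; numbers are carried as strings, just as the PHP sample casts them.

```ruby
require 'set'

# Maps a plain Ruby value to a { type code => value } pair.
def typed_attribute(value)
  case value
  when String  then { "S" => value }
  when Numeric then { "N" => value.to_s }
  when Set, Array
    items = value.to_a
    # An empty collection has no element type to infer.
    raise ArgumentError, "empty set" if items.empty?
    if items.all? { |v| v.is_a?(String) }
      { "SS" => items }
    elsif items.all? { |v| v.is_a?(Numeric) }
      { "NS" => items.map { |v| v.to_s } }
    else
      raise ArgumentError, "a set must be all strings or all numbers"
    end
  else
    raise ArgumentError, "unsupported type: #{value.class}"
  end
end

item = { "RecordId" => "44", "Square" => 44 * 44, "Tags" => ["even", "square"] }
typed = {}
item.each { |name, value| typed[name] = typed_attribute(value) }
puts typed.inspect
```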

The DynamoDB API also includes query and scan functions. The query function queries primary key attribute values and supports the use of comparison operators. The scan function scans the entire table with optional filtering of the results of the scan. Queries are generally more efficient than scans.
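To see why queries tend to be cheaper than scans, it helps to count how many items each approach has to look at. The Ruby sketch below uses a toy in-memory table; the query and scan methods here are hypothetical illustrations and not the SDK functions of the same name.

```ruby
# Toy table: items are grouped by hash key, as an index would group them.
items_by_key = {}
1.upto(99) do |i|
  customer = (i % 3 == 0) ? "Jeff" : "Other-#{i}"
  (items_by_key[customer] ||= []) << { :customer => customer, :order => i }
end

# Query: go straight to one key, then apply a comparison to its items.
def query(table, key)
  examined = table.fetch(key, [])
  [examined.select { |item| yield(item) }, examined.size]
end

# Scan: visit every item in the table and filter afterwards.
def scan(table)
  examined = table.values.flatten
  [examined.select { |item| yield(item) }, examined.size]
end

q_items, q_examined = query(items_by_key, "Jeff") { |item| item[:order] > 90 }
s_items, s_examined = scan(items_by_key) do |item|
  item[:customer] == "Jeff" && item[:order] > 90
end

puts "query examined #{q_examined} items, scan examined #{s_examined}"
puts "same results: #{q_items == s_items}"
```

Both calls return the same three orders, but the scan has to visit every item in the table to find them.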

You can also update items, retrieve multiple items, delete items, or delete multiple items. DynamoDB includes conditional updates (to ensure that some other write hasn't occurred within a read/modify/write operation) as well as atomic increment and decrement operations. Read more in the Amazon DynamoDB Developer Guide.
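The difference between a conditional update and an atomic increment is easiest to see side by side. This is a small in-memory sketch in Ruby; the ToyItemStore class and its put_if and add methods are hypothetical names invented for the example and do not correspond to SDK calls.

```ruby
require 'thread'

class ToyItemStore
  def initialize
    @items = {}
    @lock = Mutex.new
  end

  def get(key)
    @lock.synchronize { @items[key] }
  end

  # Writes new_value only if the stored value still equals expected.
  # Returns true on success, false if another write got there first.
  def put_if(key, expected, new_value)
    @lock.synchronize do
      return false unless @items[key] == expected
      @items[key] = new_value
      true
    end
  end

  # Adds delta to a numeric value in a single step, with no prior read.
  def add(key, delta)
    @lock.synchronize { @items[key] = (@items[key] || 0) + delta }
  end
end

store = ToyItemStore.new
store.put_if("stock", nil, 10)     # create: expect "no value yet"
seen = store.get("stock")          # read
store.add("stock", -1)             # a competing writer sneaks in
stale_write = store.put_if("stock", seen, seen - 2)
puts stale_write                   # the read was stale, so the write is refused
puts store.add("stock", 5)
```

When a conditional write is refused, the usual response is to read the item again and retry with the fresh value.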

And there you have it, our first big release of 2012. I would enjoy hearing more about how you plan to put DynamoDB to use in your application. Please feel free to leave a comment on the blog.

-- Jeff;

 

 

Categories: Vendor News

AWS Free Usage Tier now Includes Microsoft Windows on EC2

Mon, 01/16/2012 - 02:02

The AWS Free Usage Tier now allows you to run Microsoft Windows Server 2008 R2 on an EC2 t1.micro instance for up to 750 hours per month. This benefit is open to new AWS customers and to those who are already participating in the Free Usage Tier, and is available in all AWS Regions with the exception of GovCloud. This is an easy way for Windows users to start learning about and enjoying the benefits of cloud computing with AWS.

The micro instances provide a small amount of consistent processing power and the ability to burst to a higher level of usage from time to time. You can use this instance to learn about Amazon EC2, support a development and test environment, build an AWS application, or host a web site (or all of the above). We've fine-tuned the micro instances to make them even better at running Microsoft Windows Server.

You can launch your instance from the AWS Management Console:

We have lots of helpful resources to get you started:

Along with 750 instance hours of Windows Server 2008 R2 per month, the Free Usage Tier also provides another 750 instance hours to run Linux (also on a t1.micro), Elastic Load Balancer time and bandwidth, Elastic Block Storage, Amazon S3 Storage, and SimpleDB storage, a bunch of Simple Queue Service and Simple Notification Service requests, and some CloudWatch metrics and alarms (see the AWS Free Usage Tier page for details). We've also boosted the amount of EBS storage space offered in the Free Usage Tier to 30GB, and we've doubled the I/O requests in the Free Usage Tier, to 2 million.
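If you are wondering why the allowance is 750 hours, a quick calculation in Ruby shows that it is enough to keep a single instance running around the clock for even the longest month:

```ruby
# Hours in the longest calendar month.
longest_month_hours = 31 * 24

# Free hours left over after running one instance non-stop for that month.
free_hours = 750
spare_hours = free_hours - longest_month_hours

puts "hours in a 31-day month: #{longest_month_hours}"
puts "spare free hours: #{spare_hours}"
```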

I look forward to hearing more about your experience with this new offering. Please feel free to leave a comment!

-- Jeff;

PS - If you want to learn more about what's next in the AWS Cloud, please sign up for our live event.

Categories: Vendor News

AWS Direct Connect - Now Available in Four Additional Locations

Tue, 01/10/2012 - 18:19

AWS Direct Connect lets you create a dedicated network connection between your office, data center, or colocation facility and an AWS Region. You might want to do this for privacy, to reduce your network costs, or to get a more consistent network experience than is possible across the Internet.

We launched AWS Direct Connect in US East (Northern Virginia) this past summer and we expanded it to Silicon Valley shortly thereafter.

Today we are making Direct Connect available in four more locations. Here's the complete list of Regions and the associated data centers:

Two of the locations listed above are not in the same city as the associated AWS Region. These locations provide you with additional flexibility when connecting to AWS from those cities. 

You can initiate the Direct Connect provisioning process by simply filling out a form:

-- Jeff;

Categories: Vendor News

Additional Reserved Instance Options for Amazon RDS

Mon, 01/09/2012 - 17:10

Hot on the heels of our announcement of Additional Reserved Instance Options for Amazon EC2, I would like to tell you about a similar option for the Amazon Relational Database Service.

We have added Light and Heavy Utilization Reserved Instances for the MySQL and Oracle database engines. You can save 30% to 55% of your On-Demand DB Instance costs, depending on your usage.

Light Utilization Reserved Instances offer the lowest upfront payment and are ideal for DB instances that are used sporadically for development and testing, or for short-term projects. You can save up to 30% on a 1-year term and 35% on a 3-year term when compared to the same instance on an On-Demand basis.

Medium Utilization Reserved Instances have a higher upfront payment than Light Utilization Reserved Instances, but a much lower hourly usage fee.  They are suitable for workloads that run most of the time, with some variability in usage. Savings range up to 35% for a 1-year term and 48% for a 3-year term when compared to On-Demand. These are the same Reserved Instances that we have offered since August 2010.

Heavy Utilization Reserved Instances are the best value for steady-state production database instances that are destined to be running 24x7. With this type of Reserved Instance you pay an upfront fee and a low hourly rate for every hour of the one or three year term. You can save 41% for a 1-year term and 55% for a 3-year term.

These Reserved Instance offerings allow you to optimize your costs depending on your workload. The table below shows which Amazon RDS offerings you can use to lower your RDS costs. For example, if you need a DB instance for 5 months, a Light Utilization Reserved Instance will provide you the lowest effective cost.

Offering              1-Year Term     3-Year Term
On-Demand             1-3 Months      1-4 Months
Light Utilization     4-8 Months      5-12 Months
Medium Utilization    9-10 Months     13-29 Months
Heavy Utilization     11-12 Months    30-36 Months
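The break-even points in a table like this come from comparing the upfront fee plus hourly charges for each offering. Here is a sketch of that arithmetic in Ruby for a 1-year term. The prices are hypothetical numbers picked only to illustrate the calculation (see the Amazon RDS pricing page for real ones), and effective_cost and cheapest_offering are made-up helper names.

```ruby
HOURS_PER_MONTH = 730
TERM_MONTHS = 12

# Hypothetical prices in dollars. Heavy Utilization is billed for every
# hour of the term, whether or not the instance is running.
OFFERINGS = {
  "On-Demand"          => { :upfront => 0,   :hourly => 0.11,  :whole_term => false },
  "Light Utilization"  => { :upfront => 100, :hourly => 0.07,  :whole_term => false },
  "Medium Utilization" => { :upfront => 250, :hourly => 0.045, :whole_term => false },
  "Heavy Utilization"  => { :upfront => 300, :hourly => 0.03,  :whole_term => true }
}

def effective_cost(offering, months_used)
  billed_months = offering[:whole_term] ? TERM_MONTHS : months_used
  offering[:upfront] + offering[:hourly] * billed_months * HOURS_PER_MONTH
end

def cheapest_offering(months_used)
  OFFERINGS.min_by { |_name, offering| effective_cost(offering, months_used) }.first
end

[2, 5, 9, 12].each { |m| puts "#{m} months: #{cheapest_offering(m)}" }
```

With these made-up prices the winners line up with the 1-Year Term column: On-Demand for very short use, then Light, Medium, and Heavy as usage approaches the full term.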

Learn more about this feature and other RDS pricing options on the Amazon RDS pricing page.

As always, we enjoy lowering our prices so that AWS becomes an even better value for you.

-- Jeff;

Categories: Vendor News

EC2 Instance Status Checks and Reporting

Fri, 01/06/2012 - 19:24

Instance Status Checks
You may remember that we recently introduced EC2 Instance Status Monitoring features to give you better visibility into the status of your AWS resources. We began by providing you with information about operational activities that have been scheduled for your EC2 instances. Since then, we’ve added more functionality.

You can now view status checks to help identify problems that may impair an instance’s ability to run your applications. These status checks are the results of automated tests performed by EC2 on every running instance that detect hardware and software issues. Whether you are running applications on AWS or elsewhere, diagnosing problems quickly and accurately can be difficult. For example, to determine that a faulty boot sequence has crashed before it initialized an instance’s networking stack or that an instance has failed to renew its DHCP lease, it helps to confirm first that the instance is powered on, and all networking equipment is performing as expected.

You have told us that you want to know when problems such as these may affect your instances and that you want to be able to distinguish software problems from issues with the underlying infrastructure. To this end, we are introducing two types of status checks for each of your instances: System status checks and Instance status checks. These checks verify that the instance and the operating system are reachable from our monitoring system.

System status checks detect problems with the underlying EC2 systems that are used by each individual instance. The first System status check we are introducing is a reachability check.

  • The System Reachability check confirms that we are able to get network packets to your instance.

System status problems require AWS involvement to repair. We work hard to fix every one as soon as it arises, and we are continually driving down their occurrence. However, we also want you to have enough visibility to decide whether you want to wait for our systems to fix the issue or resolve it yourself (by restarting or replacing an instance).

Instance Status checks detect problems within your instance. Typically, these are problems that you as a customer can fix, for example by rebooting the instance or making changes in your operating system. There is currently one Instance status check.

  • The Instance Reachability check confirms that we are able to deliver network packets to the operating system hosted on your instance.

Over time, we will add to these checks as we continue to improve our detection methods.
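Because the two kinds of checks point to different owners of the problem, a monitoring script can use them to pick a next step. The following is a minimal Ruby sketch of that triage. The :ok and :impaired symbols, the triage helper, and the advice strings are hypothetical and do not mirror the format returned by the API.

```ruby
# Decide what to do from the results of the two status checks.
def triage(system_status, instance_status)
  if system_status == :impaired
    "infrastructure problem: wait for AWS to repair it, or restart or replace the instance"
  elsif instance_status == :impaired
    "problem inside the instance: reboot it or fix the operating system configuration"
  else
    "healthy"
  end
end

puts triage(:ok, :ok)
puts triage(:ok, :impaired)
puts triage(:impaired, :impaired)
```

The system check is tested first on the assumption that an infrastructure problem can also cause the instance check to fail.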

We are also introducing a reporting system to allow you to provide us with additional information on the status of your EC2 instances.

You can access this functionality from the new DescribeInstanceStatus and ReportInstanceStatus APIs, the AWS Management Console, and the command-line tools.

Console Support
The status of each of your instances is displayed in the instance list:

The console displays detailed information about the status checks when an instance is selected:

You can use the Submit Feedback button to report discrepancies between the reported status and your own observations or to provide more detail about issues you encounter:

We will use the feedback entered in this form to identify issues that might be affecting multiple AWS customers and improve our detection systems accordingly.

Update: A few people have emailed me to ask about the new Status Checks column in the Console's instance list. If you don't see it, click on the Show/Hide button and make sure that the Status Checks column is checked:

-- Jeff;

Categories: Vendor News

How Collections Work in the AWS SDK for Ruby

Wed, 01/04/2012 - 14:01

Today we have a guest blog post from Matty Noble, Software Development Engineer, SDKs and Tools Team. 

- rodica


We've seen a few questions lately about how to work with collections of resources in the SDK for Ruby, so I'd like to take a moment to explain some of the common patterns and how to use them. There are many different kinds of collections in the SDK. To keep things simple, I'll focus on Amazon EC2, but most of what you'll see here applies to other service interfaces as well.

Before we do anything else, let's start up an IRB session and configure a service interface to talk to EC2:

$ irb -r rubygems -r aws-sdk
> ec2 = AWS::EC2.new(:access_key_id => "KEY", :secret_access_key => "SECRET")

There are quite a few collections available to us in EC2, but one of the first things we need to do in any EC2 application is to find a machine image (AMI) that we can use to start instances. We can manage the images available to us using the images collection:

> ec2.images
=> <AWS::EC2::ImageCollection>

When you call this method, you'll notice that it returns very quickly; the SDK for Ruby lazy-loads all of its collections, so just getting the collection doesn't do any work. This is good, because often you don't want to fetch the entire collection. For example, if you know the ID of the AMI you want, you can reference it directly like this:

> image = ec2.images["ami-310bcb58"]
=> <AWS::EC2::Image id:ami-310bcb58>

Again, this returns very quickly. We've told the SDK that we want ami-310bcb58, but we haven't said anything about what we want to do with it. Let's get the description:

> image.description
=> "Amazon Linux AMI i386 EBS"

This takes a little longer, and if you have logging enabled you'll see a message like this:

[AWS EC2 200 0.411906] describe_images(:image_ids=>["ami-310bcb58"])

Now that we've said we want the description of this AMI, the SDK will ask EC2 for just the information we need. The SDK doesn't cache this information, so if we do the same thing again, the SDK will make another request. This might not seem very useful at first -- but by not caching, the SDK allows you to do things like polling for state changes very easily. For example, if we want to wait until an instance is no longer pending, we can do this:

> sleep 1 until ec2.instances["i-123"].status != :pending
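The one-liner above will loop forever if the instance never leaves the pending state, so in a script you may want to cap the number of polls. Here is a self-contained sketch of that idea. FakeInstance is a hypothetical stand-in that, like the SDK's resource objects, does no work when created and makes a fresh (here simulated) request on every attribute read; wait_until is a made-up helper, not part of the SDK.

```ruby
# Stand-in for a lazily loaded resource with no caching.
class FakeInstance
  attr_reader :requests

  def initialize(statuses)
    @statuses = statuses   # what the simulated service reports, in order
    @requests = 0
  end

  def status
    @requests += 1
    @statuses.size > 1 ? @statuses.shift : @statuses.first
  end
end

# Polls until the block returns true, giving up after max_attempts polls.
def wait_until(max_attempts, delay_seconds)
  max_attempts.times do
    return true if yield
    sleep delay_seconds
  end
  false
end

instance = FakeInstance.new([:pending, :pending, :running])
ready = wait_until(10, 0.01) { instance.status != :pending }
puts "ready=#{ready} after #{instance.requests} requests"
```

Each poll shows up as one request, which is exactly the behavior that makes polling work without any cache invalidation.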

The [] method is useful for getting information about one resource, but what if we want information about multiple resources? Again, let's look at EC2 images as an example. Let's start by counting the images available to us:

> ec2.images.to_a.size
[AWS EC2 200 29.406704] describe_images()
=> 7677

The to_a method gives us an array containing all of the images. Now, let's try to get some information about these images. All collections include Enumerable, so we can use standard methods like map or inject. Let's try to get all the image descriptions using map:

> ec2.images.map(&:description)

This takes a very long time. Why? As we saw earlier, the SDK doesn't cache anything by default, so it has to make one request to get the list of all images, and then one request for each returned image (in sequence) to get the description. That's a lot of round trips -- and it's mostly wasted effort, because EC2 provides all the information we need in the response to the first call (the one that lists all the images). The SDK doesn't know what to do with that data, so the information is lost and has to be re-fetched image by image. We can get the descriptions much more efficiently like this:

> AWS.memoize { ec2.images.map(&:description) }

AWS.memoize tells the SDK to hold on to all the information it gets from the service in the scope of the block. So when it gets the list of images along with their descriptions (and other information) it puts all that data into a thread-local cache. When we call Image#description on each item in the array, the SDK knows that the data might already be cached (because of the memoize block) so it checks the cache before fetching any information from the service.
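To make the difference concrete, here is a toy model of the pattern in plain Ruby. It is only a sketch: FakeService, Image, all_images, and Sketch.memoize are hypothetical stand-ins, and the SDK's actual implementation may differ. The point is to count round trips with and without a block-scoped, thread-local cache.

```ruby
# Toy model only: FakeService, Image, and Sketch.memoize are hypothetical
# stand-ins, not the SDK's real classes or implementation.
class FakeService
  attr_reader :requests

  def initialize
    @requests = 0
    @data = { "ami-1" => "Linux", "ami-2" => "Windows" }
  end

  # One simulated round trip; returns records for all images or for the given IDs.
  def describe_images(ids = nil)
    @requests += 1
    ids ? @data.select { |id, _| ids.include?(id) } : @data.dup
  end
end

module Sketch
  # Run the block with a thread-local cache in place, then restore the old state.
  def self.memoize
    previous = Thread.current[:sketch_cache]
    Thread.current[:sketch_cache] = {}
    yield
  ensure
    Thread.current[:sketch_cache] = previous
  end

  def self.cache
    Thread.current[:sketch_cache]
  end
end

class Image
  def initialize(service, id)
    @service, @id = service, id
  end

  # Check the cache (if one is active) before asking the service.
  def description
    cache = Sketch.cache
    return cache[@id] if cache && cache.key?(@id)
    @service.describe_images([@id])[@id]
  end
end

# List all images; when a cache is active, keep the data the list call returned.
def all_images(service)
  records = service.describe_images
  Sketch.cache.merge!(records) if Sketch.cache
  records.keys.map { |id| Image.new(service, id) }
end

service = FakeService.new
all_images(service).map(&:description)
uncached = service.requests   # one list call plus one call per image

service = FakeService.new
Sketch.memoize { all_images(service).map(&:description) }
memoized = service.requests   # only the list call

puts "without memoize: #{uncached} requests, with memoize: #{memoized}"
```

With two images the uncached run makes three requests and the memoized run makes one; with thousands of images the gap grows accordingly.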

We've just scratched the surface of what you can do with collections in the AWS SDK for Ruby. In addition to the basic patterns above, many of our APIs allow for more sophisticated filtering and pagination options. For more information about these APIs, you can take a look at the extensive API reference documentation for the SDK. Also don't hesitate to ask questions or leave feedback in our Ruby development forum.

A note about AWS.memoize

AWS.memoize works with EC2, IAM, and ELB; we'd like to extend it to other services, and we'd also like to hear what you think about it. Is the behavior easy to understand? Does it work well in practice? Where would this feature be most beneficial to your application?


Help Wanted - Manager and Senior Developers for new AWS Media Product

Wed, 01/04/2012 - 08:27

We are staffing up a brand-new AWS team to take advantage of some really interesting opportunities in the digital media space.

This team is being launched from an existing product. This particular product has seen exponential growth in the size of its user base on an annualized (run rate) basis, along with 30x revenue growth over the same period.

In order to address this opportunity, we need to hire a Senior Development Manager and multiple Senior Developers ASAP. You'll be able to start from scratch, building a large-scale distributed application as part of the Seattle-based team.

The team is currently searching for developers at our "SDE III" level. Successful applicants for an SDE III position typically have six or more years of development experience in the industry, along with a BS or MS in Computer Science. They are able to solve large problems in the face of ambiguity, and are able to work on the architecture and the code. They will also have launched projects of significant complexity in the recent past, and have the ability to balance technical complexity with business value.

The developers on this team will drive the architecture and the technology choices. They'll need to have a broad knowledge of emerging technologies and will know the ins and outs of Java, C or C++, and Linux or Windows. They will also have significant experience with networking, multi-threaded applications, interprocess communication, and the architecture of fault-tolerant systems.

We also need a senior manager of software development to build and run this team of top performers. The manager will establish a project framework and will also be responsible for putting the right development practices into place. The manager will also be responsible for providing technical leadership and guidance to the team. Well-qualified applicants will have been managing teams of developers for four or more years, and will have shipped one or more highly available large-scale internet applications.

If you are qualified for one of these positions and you would like to apply, please send your resume to newawsproject@amazon.com.

-- Jeff;


Big News Regarding Python, boto, and AWS

Tue, 01/03/2012 - 15:01

We know that developers want to call the AWS APIs from many different languages. Over the last couple of years we have created and delivered SDKs for Java, PHP, .NET, and Ruby. We have also done the same for the iOS and the Android platforms.

We also know that the AWS community has dived in to create their own SDKs for languages and environments that we don't currently support. For example, the AWS SDK for PHP started out as an independent project called CloudFusion. We hired Ryan Parman (originator of the project) a year or two ago and he's become a valued member of the AWS Developer Resources team.

Building on this model, Mitch Garnaat has also joined the team. Mitch has been a member of the AWS community for over 6 years and has made over 2,000 posts to the AWS Developer Forums. He is also the author of boto, the most popular third-party library for accessing AWS, and of the Python and AWS Cookbook.

Boto will continue to exist as an open source project and we will be making official contributions to it.

The AWS Developer Resources team has big plans for 2012 and is hiring accordingly. Here are some of our open positions:

-- Jeff;


New Features for Amazon SNS - Delivery Policies and Message Formatting

Wed, 12/28/2011 - 21:59

We have added two new features to Amazon SNS to give you additional control over the content and delivery of your messages. As a brief reminder, SNS allows you to create named Topics, subscribe to Topics (with delivery via email, HTTP / HTTPS, an SMS message, or to an SQS queue), and to publish messages to Topics.

SNS Delivery Policies
The SNS Delivery Policies give you options to control the delivery rate and error handling for each SNS endpoint. You can, for example, use a Delivery Policy to avoid overwhelming a particular endpoint with a sudden barrage of messages.

Delivery Policies can be set for Topics and for the endpoints associated with a particular Topic. Each Delivery Policy contains a Retry Policy and a Throttle Policy. With this release, the policies are effective for the HTTP and HTTPS Subscription types.

The Retry Policy can specify the following options:

  • minDelayTarget - Minimum delay for a retry.
  • maxDelayTarget - Maximum delay for a retry.
  • numNoDelayRetries - Number of retries to be done with no delay (as soon as possible).
  • numMinDelayRetries - Number of retries to be done at minDelayTarget intervals before initiating the backoff function.
  • numMaxDelayRetries - Number of retries to be done at maxDelayTarget during the backoff function.
  • backoffFunction - Model for backoff between retries: Linear, Exponential, or Arithmetic.

There are default, minimum, and maximum values for each option; see the SNS documentation for more information.
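As a rough illustration of how these options could combine into a retry schedule, here is a sketch in Ruby. The phase ordering (immediate retries, then retries at the minimum delay, then a ramp, then retries at the maximum delay) and the linear ramp are assumptions made for illustration, backoff_steps is a hypothetical parameter rather than an SNS option, and the numeric values are made up.

```ruby
# Sketch only: the phase ordering and the linear ramp are assumptions made
# for illustration; backoff_steps is a hypothetical parameter, not an SNS option.
def retry_delays(policy, backoff_steps)
  min = policy["minDelayTarget"]
  max = policy["maxDelayTarget"]

  immediate = [0] * policy["numNoDelayRetries"]
  at_min    = [min] * policy["numMinDelayRetries"]
  # Linear ramp between min and max, excluding both endpoints.
  ramp      = (1..backoff_steps).map { |i| min + (max - min) * i / (backoff_steps + 1) }
  at_max    = [max] * policy["numMaxDelayRetries"]

  immediate + at_min + ramp + at_max
end

# Made-up values; only the option names come from the list above.
policy = {
  "minDelayTarget"     => 20,
  "maxDelayTarget"     => 100,
  "numNoDelayRetries"  => 1,
  "numMinDelayRetries" => 2,
  "numMaxDelayRetries" => 2
}

p retry_delays(policy, 3)   # prints [0, 20, 20, 40, 60, 80, 100, 100]
```

The SNS documentation remains the authority on how the service actually sequences retries and which backoff functions it supports.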

The Throttle Policy can specify one option:

  • maxReceivesPerSecond - Maximum number of delivery attempts per second per Subscription.

All attempts to deliver a message are based on an "effective Delivery Policy" which combines the default policy, any policy values set for the Topic, and any policy values set for the Subscription endpoint. Values left unspecified at the Subscription level will be inherited from the Topic's Delivery Policy and then from the default policy.
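The inheritance rule can be pictured as a layered merge. The sketch below assumes a flattened policy represented as a simple hash (the real policy document nests the Retry Policy and Throttle Policy), and all values, along with DEFAULT_POLICY and effective_policy, are hypothetical.

```ruby
# Hypothetical values throughout; only the option names come from the post.
DEFAULT_POLICY = {
  "minDelayTarget"    => 20,
  "maxDelayTarget"    => 20,
  "numNoDelayRetries" => 0
}.freeze

# Later hashes win, so precedence is default < Topic < Subscription.
def effective_policy(topic_policy, subscription_policy)
  DEFAULT_POLICY.merge(topic_policy).merge(subscription_policy)
end

topic        = { "maxDelayTarget" => 120 }
subscription = { "numNoDelayRetries" => 2 }

p effective_policy(topic, subscription)
```

Here minDelayTarget falls through to the default, maxDelayTarget comes from the Topic, and numNoDelayRetries comes from the Subscription.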

SNS Message Formatting
This feature gives you the ability to publish a message that contains content that is specific to each type of Subscription. You could, for example, send a short message to an SMS endpoint and a longer message to an email endpoint.

To use this feature, you set the new MessageStructure parameter to "json" when you call the SNS publish function.  The associated message body must contain a JSON object with a default message body and optional message bodies for other protocols:

{
  "default" : "Server busy.",
  "email"   : "Dear Jeff, your server is really busy and you should investigate. Best Regards, AWS.",
  "sms"     : "ALERT! Server Busy!!",
  "https"   : "ServerBusy"
}

The default entry will be used for any protocol (http and sqs in this case) that does not have an entry of its own.
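The fallback rule can be sketched in a few lines of Ruby using only the standard json library. body_for is a hypothetical helper that mimics the selection described above; it does not call SNS.

```ruby
require "json"

# Hypothetical helper: pick the body a given protocol would receive from a
# per-protocol message, falling back to the "default" entry.
def body_for(message_json, protocol)
  bodies = JSON.parse(message_json)
  bodies.fetch(protocol) { bodies.fetch("default") }
end

message = JSON.generate({
  "default" => "Server busy.",
  "email"   => "Dear Jeff, your server is really busy and you should investigate. Best Regards, AWS.",
  "sms"     => "ALERT! Server Busy!!",
  "https"   => "ServerBusy"
})

%w[sms https http sqs].each do |protocol|
  puts "#{protocol}: #{body_for(message, protocol)}"
end
```

The string in message is what you would pass as the message body when MessageStructure is set to "json".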

-- Jeff;

PS - Please feel free to share this with your friends via the Like button below!
