How to develop and deploy web-scale apps on AWS – your questions answered
We recently ran a webinar on how to develop and deploy web-scale applications on AWS (you can watch the recording here, or check out the slides here). We received a lot of questions from viewers throughout the session and we unfortunately ran out of time before we could answer them all. So we've gone through each of the questions and answered them below. If you have any more questions, you can always get in touch and we'd be happy to help.
Q: Not so much a question as a comment but it's well worth mentioning that people should use SES for transactional mail delivery when using EC2 instances.
A: This is a good point. If your application needs to send transactional emails, such as registration confirmations, use SES. Don't send mail directly from EC2 instances, as their IP addresses can end up on spam lists.
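As a rough illustration, sending a registration email through SES could look like the sketch below. It assumes the boto3 library and its SES `send_email` call. The function names, subject line, default region and message text are placeholders of ours, and the sender address would need to be verified in SES first.

```python
def build_registration_email(sender, recipient, username):
    """Build the keyword arguments for the SES SendEmail call."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Data": "Welcome!", "Charset": "UTF-8"},
            "Body": {
                "Text": {
                    "Data": f"Hi {username}, thanks for signing up.",
                    "Charset": "UTF-8",
                }
            },
        },
    }


def send_registration_email(sender, recipient, username, region="eu-west-1"):
    """Send the email via SES. Needs boto3 and AWS credentials (assumed)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("ses", region_name=region)
    return client.send_email(**build_registration_email(sender, recipient, username))
```

Keeping the message construction separate from the network call makes it easy to check what will be sent without touching AWS.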
Q: What do you suggest for users who want to use NFS on AWS?
A: If you want to use NFS to store media files like images, audio or video for use with your web app, use S3 instead. If you can't change your application code to use S3, Amazon provides a managed NFS service called EFS (Elastic File System), but at the time of writing this is still in preview and not available in all regions.
The other option is to use a GlusterFS cluster built on EC2 and EBS, which will give you high availability and resiliency. We recommend avoiding NFS where possible and storing files in S3 whenever you can.
Q: No mention of AWS Lambda. Is this not useful for scaling?
A: AWS Lambda is an interesting service. It's fairly new, but we can see it being useful for ad-hoc tasks that don't require servers to be running all the time. You can write a piece of code that runs in response to defined events or requests, such as a file upload triggering a video transcode.
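To make the file upload case concrete, here is a minimal sketch of a Python Lambda handler reacting to an S3 upload notification. The event layout follows the S3 notification format. `start_transcode` is a hypothetical placeholder of ours: a real function would hand the file to whatever transcoding service you use.

```python
import urllib.parse


def start_transcode(bucket, key):
    """Hypothetical placeholder: a real function would submit a transcoding job."""
    return f"transcode requested for s3://{bucket}/{key}"


def handler(event, context):
    """Entry point invoked by Lambda for S3 object-created notifications."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        results.append(start_transcode(bucket, key))
    return results
```

No server sits idle waiting for uploads: the code only runs when a notification arrives.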
Q: How do you decide whether it's time to upgrade an instance type rather than autoscaling?
A: It's the difference between scaling up and scaling out, and it will depend on your application. If your application is CPU bound, it's probably better to move to an instance type with more CPU so each instance can handle a greater workload. If it's more request-based, you would want a larger number of instances to lower request latency.
Q: Can we use AWS for a hybrid solution? Can we migrate to AWS and back to on-premise?
A: Yes - you can use software like our Zerto DR service, but there are some limitations.
Q: How should I deploy a scheduler reading a queue from MySQL in RDS with multi-instances? Constraint is that we cannot execute more than 1 instance of the scheduler.
A: We'd need more details to answer this properly, but jobs like this (e.g. cron jobs) are usually deployed to a single management server in a scale-out, web-fronted infrastructure. Another possibility is to use something like Lambda.
Q: Having an API deployed on multiple instances behind a load balancer solves the API issue. However, if there is a background process that reads the queue to process tasks, should this be on AWS? If so, how can it be made highly available and reliable?
A: This should be in a separate app tier, with multiple workers reading from the queue and running the tasks on it. In the worker queue model, a queue (e.g. SQS) receives the tasks and the worker tier consumes them.
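The shape of the worker queue model can be sketched with Python's standard library alone. Here an in-process `queue.Queue` and threads stand in for SQS and the worker tier, purely for illustration. The function and parameter names are ours, not part of any AWS API.

```python
import queue
import threading


def run_workers(tasks, handle, worker_count=3):
    """Process every task with a pool of workers pulling from one shared queue."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker exits
            outcome = handle(task)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With SQS in place of the in-process queue, a message that a worker receives but never deletes becomes visible again after its visibility timeout, so another worker can pick it up if the first one fails. That is where much of the reliability of this model comes from.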
Q: Do you have any recommendations for the best ways to handle authentication and authorisation of users within a decoupled AWS-based application?
A: It would depend entirely on your application requirements, and we would need some more details before suggesting options. If you'd like to talk more about this, you can get in touch here.
Q: How do you deal with SQL storing data in a distributed environment that can fail?
A: SQL databases are hard to run in distributed environments that can fail. Their replication, which is typically limited to master-slave, is not as robust as that of some NoSQL databases.
AWS provides RDS, a scalable, fault-tolerant SQL database service (Amazon Aurora, MySQL, PostgreSQL, Oracle, MSSQL, MariaDB). It is essentially a managed installation of the database with Multi-AZ replication, and it fails over to a standby if an availability zone goes down.
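During a failover, existing database connections are dropped, so the application has to reconnect. Below is a minimal sketch of reconnecting with exponential backoff. The `connect` callable is an assumption standing in for your database driver's connect function, and real drivers usually raise their own exception types rather than `ConnectionError`.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as error:  # assumption: swap in your driver's error
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The attempt count and delays are illustrative and should be tuned to how long a failover takes in your own setup.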
If your application can be rewritten to use NoSQL, you have a couple of options for highly available, highly durable databases. Apache Cassandra, originally developed at Facebook, can be a really good open source choice, as it lets you use the ephemeral storage of EC2 instances and replicate data between servers and data centres.
For more information on how we can help with your AWS projects, or to get a quote, click here.