3 Reasons AWS Lambda Is Not Ready for Prime Time

Chad Lung recently put together a tutorial about writing a Python microservice using AWS Lambda, reachable via HTTP. It’s well written, it’s cogent, and it does a great job of demonstrating how Lambda is cool.

If you’re not familiar with Lambda, it’s a new AWS feature that’s meant to give you a way to quickly write a service and let Amazon worry about all the boilerplate junk that normally goes with standing your service up in a way that people can actually talk to it. You don’t configure subnets or instances or load balancers with Lambda: you just write some code and then tell Amazon to hook you up. It’s a pretty compelling promise.

When we at Datawire tried to actually use Lambda for a real-world HTTP-based microservice shortly before Lung’s tutorial came out, though, we found some uncool things that make Lambda not yet ready for the world we live in:

  • Lambda is a building block, not a tool
  • Lambda is not well documented
  • Lambda is terrible at error handling

Lung skips these uncool things, which makes sense because they’d make the tutorial collapse under its own weight, but you can’t skip them if you want to work in the real world. (Note that if you’re using Lambda for event handling within the AWS world, your life will be easier. But the really interesting case in the microservice world is Lambda and HTTP.)

Lambda is Only a Building Block

It’s not just Lambda, even: AWS’ model is that they provide building blocks, and they expect others to wrap real tools around them. If you try to interact directly with AWS, it’s absurdly manual.

To wit, Lung’s tutorial shows us that manually setting up a Python Lambda is a twenty-step process – and that’s a service with exactly one endpoint that uses GET and takes just one argument on the query string. Mind you, about half those steps (8-10, depending) are things you’ll have to repeat for every endpoint you create. If you have even five services, you’re looking not at 20 steps, but 50-60. Imagine 100 services. Imagine how often you’ll have to do this if you’re using versioned endpoints. Do 8-10 manual configuration steps per endpoint, every time you roll out a new version, sound like fun?

The root of the problem here is that we want a tool (our microservice) but AWS gives us building blocks, and leaves connecting them up to us. The blocks here are Lambda and the API Gateway, and it’s telling that Lung starts his tutorial not by creating a Lambda but rather by messing with the API Gateway – the gateway is a little annoying. Those 8-10 steps I mention above are all API Gateway stuff, and what’s worse, they’re the minimum for HTTP-only support. If you want HTTPS (and you should, it being 2016 and all), you need to add that into the mix as well.

This is, of course, the sort of thing that cries out for automation, and various folks are scrambling to fill the void. We already use Terraform for wrangling EC2, and it won’t take much for them to cope with Lambda and the API Gateway. Serverless announced Python support in their recent 0.2.0 release; Zappa’s initial release happened just two days ago. More on those as we experiment – at a first glance, though, none of these tools looks like I can trust it in production yet, so we’re still left with the manual world. (Of course, if you’re working on any of these tools, I’d welcome hearing why I’m wrong here!)

Lambda Is Not Well Documented

“But wait,” I hear you (and all the AWS folks) shout, “you lie! There’re all kinds of docs about Lambda and AWS online!”

Well, let’s imagine that you’re a developer at Alice’s House of Grues. You’ve been tasked with creating the Grue Locator service, which takes the name of a grue and responds with its location. Let’s further imagine that you’re a good developer:

  • You’re a team player, so you and the rest of your team are going to agree on the API before you write code.
  • You’re conscientious, so you’re going to test your code locally before deploying it (and you’re going to automate the tests).
  • You’re a Python developer, so you’re going to write in Python.

We’ll make the API easy: GET /v1/grue/grue_name will give you back some JSON. Done. (We’ll assume that the service just magically knows where the grue is.)

Given the API, you can sit down to write your tests and code locally. First questions: how does the grue_name get passed to your code, and how does output get handed back? Remember: you’re about to write code, so you need specifics that cover normal operation, exceptional conditions, error cases, and corner cases brought on by wrong (or malicious) clients.

Go ahead. Fire up Google. I’ll wait.

Back again? OK. You found a lot of hits, right? Then you had to spend some time differentiating info about AWS Lambda from the Python programming term lambda. Even at that point there’s a lot of chaff, but I’m sure you eventually got to the AWS Programming Model for Authoring Lambda Functions in Python pages, which dance around the topic and have examples from which you can infer many things, but do not actually provide a specification.

Here’s what the Python programming model actually says about the inputs and outputs:

event – AWS Lambda uses this parameter to pass in event data to the handler. This parameter is usually of the Python dict type. It can also be list, str, int, float, or NoneType type.

The implication here is that it’s expecting JSON over the wire, which will be deserialized into the event parameter. This is further implied by Lung’s directions about mapping templates, which include explicitly setting the input type to application/json and manually constructing a JSON dictionary from the URL query string, but it’s still an implication and not a specification.
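
Based on that implication – and on nothing firmer – the handler you’d sit down and write for the Grue Locator ends up looking something like the sketch below. The shape of event here is my inference from the mapping template, not anything the docs promise, and the lookup table is obviously fake:

```python
# Hypothetical Grue Locator handler. The shape of `event` is inferred from the
# mapping template, not specified anywhere in the Lambda docs.

# Pretend this is the magical part that knows where every grue is.
GRUE_LOCATIONS = {"gertrude": "the dark cave", "gus": "behind the white house"}

def lambda_handler(event, context):
    # We *assume* the API Gateway mapping template has turned
    # GET /v1/grue/gertrude into {"grue_name": "gertrude"}.
    grue_name = event.get("grue_name")
    location = GRUE_LOCATIONS.get(grue_name, "unknown")
    # We *assume* a returned dict gets serialized to JSON on the way back out.
    return {"grue": grue_name, "location": location}
```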

This may seem unnecessarily pedantic. Maybe it seems obvious that the handler is going to accept a JSON dictionary and deserialize it, so they shouldn’t need to spell it out. But no, this really is important: if they don’t spell out what’s valid input, they’re not telling you what to expect when – not if – the user hands in something unexpected.

Suppose the user hands in a simple string instead of a JSON-encoded dict? Suppose they hand in a JSON-encoded array? Suppose they hand in gibberish? We’ll talk about this more shortly when we get to error handling, but the problem is that you really have no idea. Instead you have to guess and test.

Likewise, here’s what they say about outputs:

Optionally, the handler can return a value. What happens to the returned value depends on the invocation type you use when invoking the Lambda function:

  • If you use the RequestResponse invocation type (synchronous execution), AWS Lambda returns the result of the Python function call to the client invoking the Lambda function (in the HTTP response to the invocation request, serialized into JSON). For example, AWS Lambda console uses the RequestResponse invocation type, so when you test invoke the function using the console, the console will display the returned value.
  • If the handler does not return anything, AWS Lambda returns null.
  • If you use the Event invocation type (asynchronous execution), the value is discarded.

This is, well, better. It actually says that it’s going to take the output and serialize it into JSON. Great. The main issue here is that we have to go somewhere else to find out how to control the invocation type (the tutorial doesn’t even mention it); at least it explicitly says that the test console uses the RequestResponse version. But again, suppose you return malformed data? If you return a custom object, how does the serialization happen? What happens if you return None? What happens if your handler raises an exception? Again, without a spec it’s a guess-and-test game.
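
For what it’s worth, the invocation type is something you choose when you call the Lambda API yourself – roughly like this with boto3 (the function name is hypothetical) – though when the API Gateway does the invoking, you never touch it directly:

```python
import json
import boto3

lam = boto3.client("lambda")

# Synchronous: RequestResponse hands the handler's (JSON-serialized) return
# value back to the caller.
resp = lam.invoke(
    FunctionName="grue-locator",
    InvocationType="RequestResponse",
    Payload=json.dumps({"grue_name": "gertrude"}),
)
print(json.loads(resp["Payload"].read()))

# Asynchronous: Event discards the return value entirely.
lam.invoke(
    FunctionName="grue-locator",
    InvocationType="Event",
    Payload=json.dumps({"grue_name": "gertrude"}),
)
```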

Note that AWS Lambda is hardly alone here: poor docs are endemic in the industry, enough so that I’ll be writing further about it later. But it’s definitely an issue with trying to get started with Lambda.

Lambda Is Terrible at Errors

We mentioned a few specific cases above but let’s get more into this, because it’s a dealbreaker. How, exactly, are you meant to manage errors in Lambda?

Bear in mind that there are a lot of potential points of error in Lambda. The user can call you with bad data, your own service’s processing can fail, the network can fail, the list goes on and on… and, again, Amazon never documents how exactly you’re meant to deal with this.

Walking down some of the simple cases:

The user can give you unparseable input.

This is actually sort of simple: AWS will simply not run your Lambda if it can’t understand the input at all. Of course, the logging AWS provides (CloudWatch Logs) often didn’t actually capture anything about this case when I was trying it, so it was a little tough to be sure exactly what was happening.

The user can give you input that’s parseable but illegal.

This is the case where the user gives you valid JSON, but it doesn’t make any sense (maybe you have a required dictionary element that they didn’t provide). Lambda doesn’t seem to have any specific way to handle this, so we have to treat it as a generic error at runtime, which leads us to…

Something goes wrong at runtime.

Maybe another service that we need is down. Maybe the user made a nonsensical request. We’d like to log this for debugging, and return an error to the caller so that they know it didn’t work. Logging should be easy: per AWS, any logging we do from Python should appear in CloudWatch Logs, as should writes to stdout. Sadly, I often seemed to be missing output when I was experimenting, which complicated life.

Worse, there doesn’t seem to be a way to get Lambda in Python to return anything but HTTP 200. If you’re writing in JavaScript, maybe you can – so why not Python? (I’d love to be wrong about this, by the way.)

The combination means that the most effective tool here is to always return a dictionary with an element indicating whether the request succeeded or failed, since you don’t have HTTP status codes to work with.
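
In practice that means an envelope of your own devising, something like the sketch below (the field names are arbitrary):

```python
# Our own convention, not anything Lambda defines: every response says whether
# it worked, because the HTTP status code can't.
def success(payload):
    return {"ok": True, "result": payload}

def failure(message):
    return {"ok": False, "error": message}
```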

Finally:

The runtime raises an exception.

AWS claims that if a Lambda raises an exception, a specific JSON dictionary is returned. This pretty much seemed to work, when I did it, but of course it means that you get something totally unlike the output you see in the normal case. Combined with the lack of control over HTTP status codes, wrapping your entire service in a big try/catch block seems critical to have any control at all.
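
Concretely, the wrapper ends up looking something like the sketch below (reusing the success/failure envelope from above; the grue logic is a placeholder). The goal is just that the caller always sees the same shape of response, instead of sometimes getting Lambda’s own exception dictionary:

```python
import logging

log = logging.getLogger(__name__)

def success(payload):
    return {"ok": True, "result": payload}    # the envelope from the sketch above

def failure(message):
    return {"ok": False, "error": message}

def locate_grue(event):
    # The "real" handler logic, free to raise on bad input or downstream failures.
    return {"grue": event["grue_name"], "location": "the dark cave"}

def lambda_handler(event, context):
    try:
        return success(locate_grue(event))
    except Exception as exc:
        # Deliberately broad: anything that escapes here would reach the caller
        # as Lambda's own error dictionary instead of our envelope.
        log.exception("request failed")
        return failure(str(exc))
```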

Tools. Not Just Building Blocks

When I first sat down to write my microservice using Lambda, I really wanted it to be the greatest thing since sliced bread. It had so very much promise, and I just loved the idea that I’d be able to whip up 50 lines of Python and let Amazon worry about deploying.

Sadly, it was too good to be true. We recently rolled out that microservice using Terraform and EC2 instances because Lambda just isn’t quite ready for the real world. Maybe things will be different soon, though. Maybe Zappa, Serverless, and Terraform will take over the world. Who knows, maybe Amazon will decide that Lambda should be the first thing to move beyond the building-block stage. I’ll be watching closely: in the meantime, by all means hit me up at @_flynn if you think I missed something.

  • (sorry this is long, but I feel it’s important to consider – not that I didn’t encounter many of the same things you wrote about initially)

    Eh…I think you should spend some more time with Lambda before dismissing it. Some of these pain points you’re seeing are easy to avoid and you won’t even think about them later on. The observations here are understandable though, Amazon’s documentation does leave something to be desired! You are completely on the mark there.

    First, HTTP? o_O How are you even able to make HTTP requests via API Gateway? It shows you an HTTPS URL for your API by default and then there’s their new cert manager (free). HTTPS is not an issue in AWS land. In fact, I’m not even sure how you’re accessing the API via HTTP…Because mine automatically redirects to HTTPS.

    You shouldn’t let your user just hand anything to your Lambda either. API Gateway helps you here as does your application code (client facing stuff). So you shouldn’t end up with “gibberish” being passed. Even if so, you can catch these things – but that’s up to you to code in validation for your own specific use case. It’s always been up to the developer to handle (nginx doesn’t do it for you for example), so that shouldn’t discount something from the “prime time.” If you liken it to a framework that protects you from SQL injection, then I can see where your desire is coming from, but I don’t think Lambda should be considered a “framework.”

    Lambda is fine for errors. I’m sure there’s room for improvement, but it’s serviceable today. Everything logged out ends up in CloudWatch. You can set alarms here and trigger even more Lambdas based on things here. So sure, there’s a time investment…But it’s all there. The items you’re rattling off of what could go wrong is completely up to the developer to handle. Lambda doesn’t know about your other services or overall application state nor should it necessarily know these things. Again, unless you created one to specifically consider those concerns.

    When you talk about status codes of 200. Are you talking about API Gateway? You need to configure other responses. Are you talking about AWS SDK when invoking Lambda’s manually? They do indeed have a variety of StatusCode values on the response object.

    When you say “not ready for the prime time” I think that’s a bit misleading because of the context. What are the expectations at play? It’s a great clickbait headline, but I also feel it’s a little irresponsible at the same time. People who fail to read the article and then don’t bother to try Lambda and don’t fully understand its capabilities (let alone are well versed in microservices) might end up dismissing Lambda when it could have actually really helped them out. All because they were told by a headline not to do it.

    I get it. Given the site here. Datawire has its own software and agenda. Pushing people away from Lambda may be the goal. I don’t know…But I think there’s some bigger concerns here with Lambda (I’ll list below). None of which make it “not ready” though.

    Let’s face it, ALL of Amazon’s services are configuration monsters. They are all building blocks too. Often poorly documented (not because of lack of words, but perhaps circular documentation and confusing terminology). That’s just how AWS has been. It takes time to wrap your head around their services sometimes. It also takes a good deal of time to understand how, why, and when to use those services. Don’t expect to see all angles from a hello world tutorial.

    Do I think Lambda is perfect? No. Do I think it’s production ready? Absolutely. I use it in production. Where do I think Lambda has some things it could improve on?

    – Max. execution time. I quickly had to move my code over to ECS Tasks to get around this. Fortunately I could re-use the same exact Lambda code though. A strength of AWS services and Lambda at the same time. But yea, max execution time should be much higher.

    – Support of Languages. Most of the languages Lambda supports are a little slow in my opinion. I wish they natively supported Go. You can have Node.js or Python call a Go binary of course, but it’s not the same. However, when you use Go you end up with many things executing under 100ms (rounded up to 100ms for billing). Not so with Node.js. In fact, almost nothing you do with Node.js will be that fast. This is important because it affects your AWS bill.

    – Tools. You’re absolutely right here too. They need some tooling to help you make REST APIs more easily. Things like Serverless (formerly JAWS) help a bit. Like you mentioned. Though many of those are undergoing frequent changes. So it’s fair to say those aren’t ready for the prime time, but Lambda is as far as I’ve seen.

    Another tool you might want to check out is Apex. https://github.com/apex/apex

    Though again, I don’t think any of this makes Lambda not ready for the prime time. It has been a very effective tool for me in production and has not skipped a beat. It’s highly available. Highly useful. Dependable.

    • – Flynn

      Never apologize for long comments. 🙂

      Thanks for taking the time to lay all this out, and especially for the pointer to Apex – I’m definitely glad to hear from someone using Lambda in production, and I’m curious to hear more: are you using some of the third-party tooling? are you using versioned endpoints? how often do you deploy?

      For us, sure, we could definitely have gotten Lambda to work. However, the headaches around wrangling the different bits manually meant that Lambda would’ve been more costly in terms of devops time than just spinning up an EC2 instance, and the cost savings for using Lambda instead of the always-on EC2 micro instance we’d need for this particular microservice would not have offset that anything like sufficiently. (Note “for this particular microservice”: that tradeoff will very much depend on the application at hand. Also note that we’re not handling EC2 manually – we have a lot of automation that we lean on there, which changes the equation further.)

      On HTTP: hmm. Fascinating. I did all of my testing using HTTP, and we very definitely ran into name mapping/SSL cert hell when we went to get SSL working with a domain name of our own. I’ll take a look here again – if AWS has improved this experience, that would be wonderful news.

      On error handling, both input and output: you’re absolutely correct that nothing will prevent errors, and I don’t mind that Lambda is no exception there. Where Lambda falls down is that Amazon doesn’t even tell you what they’ll handle, what you need to handle, what they’ll do with bad data, etc. That, in turn, means that all of those behaviors can change without warning, rendering your code useless. (This may sound a touch paranoid. Sadly, I’ve lived through exactly this experience many times. 😐 )

      On the HTTP 200: a JavaScript Lambda has methods on the context object that allow for changing the status code, but a Python Lambda does not. Not the end of the world, since the API Gateway does allow you to configure responses – except that no matter what I configured, it only ever returned 200. (No, it didn’t make sense to me either.) Also, even had it worked, that’s piling more and more configuration into the API Gateway, which is more and more configuration we’ll have to duplicate when we add a service, which brings us full circle back to the tools vs. building blocks discussion.

      Once again: we could indeed have gotten it to work, and I’m glad to hear that it’s been working out for you. For us, though, the ops pain simply wasn’t worth the potential cost savings.

      Finally, Datawire definitely has no anti-Lambda agenda, since our tools are agnostic to your deployment platform. We’d be delighted to have you use Datawire tools to build a Lambda-based microservice, and in fact we’re keeping a close eye here precisely because we’d be delighted to use Lambda to save ourselves some pain and money. It’s just that right now, they won’t: it’s more practical to use EC2.

      Again, thanks for the thoughts! Off to go look at Apex…

      • Certainly a good point about devops cost and timing. I agree, it took a while to get things set up; longer than I’d like. Though it’s a learning curve and now we have a good workflow.

        I was using node-lambda (https://www.npmjs.com/package/node-lambda – though I modified it) to run the Lambdas locally and then deploy (using npm commands). We are using Node.js for our Lambdas. I tried Serverless (JAWS at the time) and now I’m into Apex, but I already established a workflow with the simple node-lambda package. Basically, developers here write their Lambda along with test cases and use npm test (and coverage reports with istanbul) and then another NPM command for running the Lambda locally. When everything works, an NPM command for deploying.

        We aren’t making use of versioning really. Not yet anyway. We deploy sparingly, but as much as a few times a day. Deploying is quite quick with these tools. Arguably too easy.

        Bummer to hear Python doesn’t have the same feature parity as Node.js in Lambda land. That, I had no clue about since I don’t use Python (or Java). Interesting.

        I’ve been looking into Datawire a bit and hope to get the chance to use it. Our needs are starting to grow and Lambda does have limitations that force us more down other paths. In some cases. Again, I don’t think the challenges we’ve run into make Lambda not worthwhile.

        The 5 min. time limit has been a big obstacle, but I recently resolved that with a Docker container that mimics Lambda: https://github.com/tmaiaroto/node-docker-lambda

        Now, I can (and yup, there’s some configuration here too…and by some I do mean a good bit) take my existing Lambdas and wrap them up into a Docker container to launch via ECS using ECS Tasks. Here, we lose the CloudWatch logging (though I can add that back in manually), but gain control over resources to a greater degree than what Lambda allows and we get around the 5 minute execution time limit.

        All that without having to rewrite our Lambdas or change our development process. Build locally, test, and run locally (with the node-lambda Node.js package) or now in the Docker container itself right from our CLI. Then deploy with relative ease (a little more involved than deploying Lambdas, but I could make a Gruntfile or shell script or something). Then it’s ECS runTask() API call instead of a Lambda invoke(). When we call from the AWS SDK. Of course hooking that up to API Gateway would also be possible…But in this specific case I’m doing a lot on schedules and from within the codebase and not from a REST API.
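
        In SDK terms the switch is just which call you make – here’s the rough shape in boto3 terms (we use the Node SDK, and the names here are hypothetical):

        ```python
        import json
        import boto3

        payload = {"job_id": "1234"}        # hypothetical job description

        # Small, fast jobs: invoke the Lambda directly.
        boto3.client("lambda").invoke(
            FunctionName="worker",
            InvocationType="Event",
            Payload=json.dumps(payload),
        )

        # Long-running jobs: launch the same code, wrapped in a container, as an ECS task.
        boto3.client("ecs").run_task(
            cluster="workers",
            taskDefinition="worker-task",
            overrides={"containerOverrides": [{
                "name": "worker",
                "environment": [{"name": "JOB", "value": json.dumps(payload)}],
            }]},
        )
        ```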

        In fact, I have a scheduler Lambda that picks up jobs from SQS and then calls ECS Tasks to perform the work. Using ECS comes with its own scheduler too of course and then many more features like setting up a bunch of different clusters, etc.

        So at the end of the day, the dev process is the same. The deploy process varies wildly. But ECS + Lambda covers all tasks that we want to run and not worry about in terms of scale. Seconds or less for Lambda. Minutes or more for ECS.

        • – Flynn

          I’m gonna steal your Docker container there. 🙂 Great idea.

          Thanks for the info on your dev flow! It’s not hard to imagine that using Lambda mostly with the AWS SDK and without externally-visible versioning would be a much more pleasant experience than having to wrangle the linkage between the API Gateway and Lambda. Wish we could do it that way. 🙂

          But (as I said in the article): I think that the balance of pain will shift rapidly as the external tooling catches up, and as language support within Lambda gets better (although that’ll probably take longer than the external tooling will!). With any luck, pretty soon I’ll get to write a post about how Lambda has become an incredibly useful tool for us.

          In the meantime, though, ping me on Twitter with any Datawire questions you might have, and we’ll get you set up on that front. We have some pretty interesting stuff coming up in the near future, and I’d definitely like to hear what you think there.

      • Gordon Shankman

        We’re currently using API Gateway and Lambda and making a switch from Java to Python due to speed issues. I found a very simple workaround for the API Gateway error handling/response code that worked great in Java, but seems to fail in Python due to the way Lambda serializes exceptions.

        API Gateway will only check the Integration Response expressions against the “errorMessage” property in the object generated by Lambda when an exception is raised by Java or Python code. It ignores anything returned by a “successful” call. Our solution for Java was to throw an exception when errors occurred, including the desired error code in a response object and serializing that object as JSON in the message of the Exception. We could then regex match each error code to generate an Integration Response mapping to set the error code and use the mapping template $input.path(‘$.errorMessage’) to deserialize our desired response before it returned to the user.

        That approach worked great with Java, but Python fails because the serialized errorMessage from the exception is wrapped in single quotes! The template deserializes it as a String rather than an object. It’s almost there, but I definitely agree with Tom’s point below about the frustrating disparity of functionality between the various supported languages. I also agree that the majority of your complaints are about API Gateway and not Lambda itself. Hopefully, Amazon fixes some of these issues in the near future!
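
        A rough Python sketch of the idea (the field names are hypothetical; per the above, the single-quote wrapping of errorMessage is what breaks the last deserialization step):

        ```python
        import json

        def lambda_handler(event, context):
            user_id = event.get("id")
            if not user_id:
                # Encode the desired status code and body as JSON in the exception
                # message; an Integration Response can then regex-match errorMessage
                # (e.g. .*"code": 400.*) to pick the HTTP status, and a mapping
                # template can pull the body back out with $input.path('$.errorMessage').
                raise Exception(json.dumps({"code": 400, "message": "id is required"}))
            return {"id": user_id, "status": "found"}
        ```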

        • – Flynn

          ‘[The API Gateway Integration Response] ignores anything returned by a “successful” response.’

          That’s just amazing… and it explains a lot of headaches I had. Did you find that documented somewhere I missed? or did you learn it empirically?

          Also, I’m now very curious about the Java vs Python performance issues you were having. Thanks!

          • Gordon Shankman

            I think it was a little of both. Spent days trying to figure out why I was getting my desired error objects back, but the response code was still ‘200’. There’s a small hint in the API Gateway docs here (https://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-method-settings-callers-console.html) where they say (emphasis added):


            Tip
            You will use Method Response, to specify all the possible response codes for your API and use Integration Response to tell API Gateway how **backend errors** are mapped to an HTTP status code.

            They never really spell it out though. I think I did finally find it somewhere, but I can’t remember if it was buried in one of their walkthroughs (the API Gateway tutorials use Node, which I believe can set the response code itself) or in a forum like this.

            As far as the performance issues, there are definite delays with both Node and Java Lambdas if there are no deployed instances when a function gets called. I think Node is faster, but we saw the JVM take ~3-4s to initialize. Part of that is our code was written in Groovy, so there was a good bit of overhead from that. Beyond the cold startup issue, we were seeing strange delays accessing network services. We’re using API Gateway to trigger Lambdas that execute calls against an AWS ElasticSearch as a Service cluster. We noticed that our first call to the ES cluster, regardless of whether it was a search, get by ID or a simple HEAD operation to check for object existence was taking 4-5s! Every single one of those operations, when tested outside of Lambda, took <1s and usually was on the order of 10-100ms. Every subsequent call completed in a reasonable amount of time. I'm still not sure what was causing the delay, possibly Java network stack initialization? With the combination of cold startup and the network delay, our API Gateway calls would routinely timeout unless we bumped the Lambda up to a minute or more, which is just unacceptable performance for the types of calls we were making. Switching to Python, the calls return almost immediately — but the error handling is a complete bust at the moment.

          • fightthefud

            What I’ve noticed is that with the Java8 engine for Lambda, when there is an error with the underlying code, it will take much longer to log and return the exception information (stacktrace). However, when everything is working, it seems to work rather quickly. The trouble is that I only got accurate exception information if I bumped up the Lambda timeout to 5 minutes (though I suppose I might be able to get away with less time).

        • disqus_oy6ZkCSOwg

          >making a switch from Java to Python due to speed issues
          I don’t think your performance issues are going to improve.

          • nate

            It depends. If you’re talking cold starts (only running 1 lambda every 30+ minutes) then yes, Python WILL be faster because Java has a lot of start-up overhead. But if you’re invoking your lambda a lot (every couple minutes MAX), then the container can be reused and Java’s cold startup won’t be a problem.

  • smegman

    Your comment about Lambda automation is… strange.

    AWS CLI tools lets you create and update, for example. Have a look: http://docs.aws.amazon.com/cli/latest/reference/lambda/index.html#cli-aws-lambda

    Not sure how you didn’t find that?

    • – Flynn

      This is coming up enough to make me regret not calling it out specifically in the post. 🙂

      It’s definitely true that I could use aws-cli to script Lambda – after all, aws-cli is pretty much just a command-line interface to the same APIs that Terraform, Serverless, et al use. But given that those guys have already done so much of the work, for me to start from scratch with aws-cli would end up being a lot of work with a very poor return on investment. Much more efficient for me to contribute to their efforts if I can, even if that’s only keeping an eye out and testing new things.

      It’s instructive, actually, to contrast how much work it is to set up EC2 with Terraform, vs writing aws-cli commands directly. One way is much less painful than the other. 🙂
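
      (For the record, the scripting itself is not the hard part – pushing new code for one function is roughly the sketch below with boto3, which talks to the same APIs as aws-cli; the names are hypothetical. It’s the per-endpoint, per-version API Gateway wiring around it that adds up.)

      ```python
      import boto3

      lam = boto3.client("lambda")

      # Hypothetical redeploy step: push a new zip for a single function.
      with open("my-function.zip", "rb") as bundle:
          lam.update_function_code(
              FunctionName="my-function",
              ZipFile=bundle.read(),
          )
      ```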

  • Ramon

    Hi Flynn,
    I wanted to say that I feel completely the opposite about Lambda. Not sure if that’s due to me being a big NodeJS fan, maybe that’s the case.
    Have you tried NPM Build, Gulp or Grunt Build?
    These tools can help you zip and publish to AWS Lambda and they are very good at helping in the build process. I’m a big fan of NodeJS and now a big fan of AWS Lambda, not sure how you can turn your back on this. I pay only a couple of dollars a month, it’s cheap to use Lambda too! I pay more for Route53 than for the Lambda! Best Regards!

    • – Flynn

      Hi there! I think that, at this point, the Lambda experience probably does vary depending on language. I wouldn’t expect it to vary hugely, though: how often do you deploy? how many endpoints? do you use versioning?

      In my experiments, getting the code to Lambda was never the hard part. Wrangling all the config to keep all the wiring intact was the hassle.

      • Ramon

        That problem is due to the code architecture you have in place. Focus on creating a great build process that will help you on the problems you had.
    There was never a configuration problem for me, but I never used tools like JAWS or anything like it; I figure they require a lot of configuration, which I was not going to do. I just focused on having my deploy process clean and always optimizing my code – it helps to focus on that instead of relying on a range of frameworks for the build process.
        Not sure how to optimize the build process in Python, but I’m sure there’re tutorials and tools around talking about it.
        Best Regards,

  • NickDA

    @disqus_7Usp7fdOx2:disqus – interestingly, most of your points actually seem to be about API Gateway, rather than Lambda itself, hence the title is misleading. All of the concept of validation, response formatting/returning, etc.

    Also, lambda errors display in CloudWatch logs, quite verbosely actually. (which is fun since it allows a meta lambda to subscribe to those logs and process the outputs if you need a health dashboard of sorts).

    • – Flynn

      Hang on, I need to finish laughing about the meta-lambda reading lambda output before I can write more… 😉

      I think that you’re correct that the API Gateway is the source of the most pain here, but I think that that pain is made dramatically worse when you’re trying to work with microservices, since there tend to be many microservices and they tend to change more often than many monoliths. I expect that there are other cases in which API Gateway’s pain isn’t as great an issue.

      Next time ’round I think I’ll make sure to explicitly call out the microservice angle in the title — thanks for the feedback!

  • Alexey

    I also miss code parametrization a lot. I have the same code deployed to different environments (let’s say production, staging etc, does not matter), the code base is the same, some parameters are different (for example the S3 bucket name). Normally you would pass these as environment variables. How do I deal with this in the Lambda world? Of course I can pass them as Lambda input parameters, but that means I need to move these parameters to API Gateway, where they don’t belong by any means.

    • TheMCME

      I believe it’s the same as with web servers: 1 Lambda function = 1 environment (1 server = 1 env). Then API Gateway stage variables are similar to Tomcat or Jetty application properties.
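
      A sketch of what that might look like from the Python side, assuming the mapping template copies a stage variable (e.g. $stageVariables.s3Bucket) into the event – the names here are made up:

      ```python
      def lambda_handler(event, context):
          # Assumes API Gateway's mapping template injected something like
          # "config": {"s3_bucket": "$stageVariables.s3Bucket"} for this stage.
          bucket = event.get("config", {}).get("s3_bucket", "default-bucket")
          return {"ok": True, "bucket": bucket}
      ```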

  • TheMCME

    The doc is indeed sometimes not clear or confusing, but I was able to code a complete RESTful API with 1 Lambda function (Java) and API Gateway (many endpoints, many HTTP methods). I don’t consider myself a genius, so maybe you didn’t spend enough time on it, but I am able to handle errors, any kind of input params, headers, and output HTTP responses almost the same way I used to with webapps deployed on a server.

  • Frederic Rudman

    WOW! SO TRUE!

    Just went through the hell of creating a real lambda function (real, as in we really needed it to work for a real-world system) and it was absolutely atrocious! Everything you wrote (and more!) is completely accurate. We ended up having to write a simple automation build-and-deploy suite just to allow us to start developing our lambdas (auto package the source, upload the zip, publish, …). What a pain.

    The good news (for your readers) is that if you persevere and reach the end-point (not a trivial task) you can start appreciating the likely future of aws lambdas once all this ‘chaff’ gets cleaned up. AWS services always seem to start particularly raw but improve over time. Hopefully same will be true here (esp. a simpler/faster connection for micro-http lambdas: I agree there as well, that’s the real gold mine)

    [I should add: our env is PHP at the server (our build-deploy env) and Node.js for our lambdas]

    My 2 cents…

  • Mark Betz

    I think the only one of your points that really holds up is the documentation complaint, and that could be said about pretty much all the fast-moving cloud services. I’ve done extensive work on both AWS and Google Cloud and they both suffer from that issue.

    On the absurdly manual nature of assembling a solution in AWS lambda I agree. However you’re obviously interacting with the console, and that is always absurdly manual. If they gave you higher level abstractions that reduced the number of steps then you’d probably run into situations where you could not do what you needed to. The console is absurdly manual for AWS services, and for Google Cloud services, and for the hoster that runs my blog. It’s also evil. It’s imperative, not reproducible, and even the person who did the clicking and typing is unlikely to remember all the steps later. Best thing to do is use the console to figure out the general way things fit together, and then use make, puppet, chef, ansible, CloudFormation, etc. to automate the steps. That’s the only way to get to reproducible deployments.

    On the error handling: at least here the docs are _fairly_ clear. If you raise an unhandled exception you get a 400 error with the exception content in a little json envelope. However, the primary use case for lambda-based services is with an entrypoint through API Gateway, and you have quite a bit more flexibility there. You define the status codes your endpoint can return in the Method Response part of the Method Execution flow, and then in the Integration Response part you map those error codes to regex matches on the results returned from lambda. So all you have to do is return descriptive strings from your lambda function and then you can use the regex to detect them, set the appropriate status code on the response, and include the string as a json body if you want. It’s definitely a phase transition, and there is the potential to lose information, which brings me to..

    The integration with Cloudwatch logging. In python anyway it is as simple as getting a logger from the logging package, setting its level and writing to it. If you have cloudwatch enabled for the function then the output shows up in a clearly labelled stream. Ditto for the API Gateway endpoint, which is really nice when you mess things up there and aren’t even getting the call through to the function. For some reason the API Gateway creates exactly 300 log streams when you enable Cloudwatch on it, which is bizarre, but the ones with activity show up at the top and you can ignore the others.
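
    Concretely, that setup is only a few lines – a minimal sketch:

    ```python
    import logging

    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    def lambda_handler(event, context):
        # With CloudWatch enabled for the function, this shows up in the
        # function's labelled log stream.
        logger.info("got event: %s", event)
        return {"ok": True}
    ```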

    • Mark Betz

      In my comment on error handling I misunderstood the mechanism API Gateway supports. You do need to raise an exception from the lambda function, since it is the lambda error header that the gateway regex matches on. Otherwise it works as described, and it’s fairly easy to set it up to return the status code you want, along with whatever parts of the error are relevant (I just return $input.path(‘$.errorMessage’)).

  • Gary

    You missed the elephant in the room: it uses a pretty old version of Python. Python 3 was released in 2008, and Lambda doesn’t yet support it.

    For someone who is a Python developer today, maintaining an up-to-date Python application, this means I can’t run any of my existing Python code in Lambda. There’s so much that’s nicer and easier to work with (and less buggy) in Python 3.5 that I can’t imagine stepping that far into the past.

    When Python 3 was released, the current version of IE was IE7. Can you imagine showing a web developer a new service that only supported IE7?

    • vlcinsky

      I partially disagree.
      You are right that we have Python 3.5 as the latest version, but Python 2.7 is also updated and is definitely not comparable to the status of IE7 – the latest 2.7.12 was released two months ago.
      I would too appreciate support of Python 3.5, but I can understand why providers like Google with App Engine and AWS with Lambda stick to version 2.7 – in the long term it is the most stable platform for Python apps. Python 3.x is still evolving and gets a minor version roughly every year and a half. Each minor version brings new features, and as soon as one is out, someone would call for AWS Lambda to update to it. This would not happen with Python 2.7.

      Anyway, I would appreciate Python 3.5 support too.

  • sinzone

    Seems like most of the article is about the aws gateway. But did you try Lambda + KONG? (https://getkong.org). Kong is definitely way more mature than its AWS counterpart, plus it is based on nginx and is open source.

    • – Flynn

      I have not tried Kong. Another one for the list, thanks!

  • Here’s the setup I’m hoping to leverage lambda for: light workers. I have kue.js/redis for submitting jobs to, and creating workers. My subscribed worker listeners will simply trigger lambda.invoke with json packages (no need to call or support http endpoints etc). No need for api gateway either.

    I’m starting with apex.run as a deployment tool and writing/running all tests locally.

    See any big hurdles with that usage plan?

    As an aside, I’ve got a backlogged task to explore serverless, and their moving away from cloudformation shouldn’t be an issue (assume terraform?)

    • Hey Mark, I’m CTO here at Serverless, happy to answer any questions

      • – Flynn

        Hey Florian, nice to see you pop up here! I too have a backlogged Serverless experiment on my list — hoping to get to it this week.

        (Once y’all documented Python support, it started moving up my list… 😉 )

      • Thanks Florian. I’ll seek you out after the worker stuff is all settled and I can revisit our api

      • Cool I just found a few minutes free to start reviewing docs. I’m exploring Apex.run (to support workers), Claudiajs and Serverless (to quickly deploy an api). Do you have any video tutorials handy somewhat like this one for Claudiajs?

  • Tobin Mori

    After setting up several experimental Lambdas, I stopped dead in my tracks when I read this:

    (TLDR – it’s more API Gateway that’s to blame for poor performance than Lambda)
    https://forums.aws.amazon.com/message.jspa?messageID=683544

    I love the idea of Lambda and would love for it to leap past this hurdle so that it can be used in production!

    • – Flynn

      That really is a depressing report.

      I’m still optimistic about the broader ecosystem around, uh, deployment without explicitly provisioning servers (which we could call serverless deployment if that weren’t so ambiguous now, sigh 😉 ). Recent comments here are only making that case more strongly; I’m looking forward to getting into checking out some of them.

      • Tobin Mori

        Indeed. I think Lambda and technologies like it have a bright future. I do wish Amazon would load test their stuff before publicizing it! I would think that this would be standard procedure before letting the public have at it. Despite that, I hope they get this optimized soon so we can make use of it.

  • Thanks for assembling this post @disqus_7Usp7fdOx2:disqus

    3 days ago I experienced a really odd situation: while testing a simple node.js microservice, API Gateway just started failing out of the blue. It turned out that the underlying Lambda function was failing with an obscure “Service error.” without any logs or context about what went wrong. I checked the AWS status page for us-east-1 and it said Lambda was operating normally. So I posted a question to the forum and got an answer from the AWS crew that in fact about 1h of Lambda downtime was a real thing!
    https://forums.aws.amazon.com/thread.jspa?messageID=721769

    To me this says: this is not a production-ready service at all.

    Anyway, check out a new open source project that tries to streamline node.js microservice deployment to AWS Lambda with API Gateway:
    https://github.com/claudiajs/claudia

  • balls187

    Executing an AWS Lambda method to respond to HTTP requests is only one use case, which failed because API Gateway is a pain to work with (Truth). I mentioned that pain to several folks on the Lambda team during last year’s re:Invent, and they were well aware of it.

    While having a fully managed REST API in Lambda is a pretty compelling use case, there are other use cases where Lambda does exceptionally well in production. For example, processing S3 files. We currently use Kinesis Firehose => S3 to store raw API metrics. When a S3 file is written, we have an AWS Lambda that is invoked and processes that data into an ELK box for visualization and other basic analytics.
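
    For anyone curious, the skeleton of that kind of S3-triggered function is roughly the sketch below (hypothetical names; the real processing into ELK is obviously more involved):

    ```python
    import boto3

    s3 = boto3.client("s3")

    def lambda_handler(event, context):
        # S3 put notifications arrive as a list of Records; each one names the
        # bucket and key of the object that was just written.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            print("processing %s/%s (%d bytes)" % (bucket, key, len(body)))
    ```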

    Having also dealt with the pain of Lambda+API Gateway, you’re spot on with your criticisms, but picking on AWS for poor documentation is like picking on MSFT for building awful software.

    • – Flynn

      Documentation is a recurring problem, industry-wide. I don’t think any of us should get a pass on that front.

      And really, that ties into my whole sense of Lambda. If Amazon didn’t so explicitly call out API Gateway + Lambda as a solution, I’d feel a whole lot better about Lambda.

      • balls187

        I agree with your statement that documentation is an issue, but singling it out as a reason that Lambda isn’t ready for general use is…

        Amazon’s approach has been to release early, and iterate, often times only solving a handful of business use cases.

        For example, only recently has Lambda supported VPC, and even then it’s half-baked – you can use VPC resources, but that will exclude S3 unless you configure your VPC properly.

  • marak

    @disqus_7Usp7fdOx2:disqus Great article, I could not agree more! If you are interested in seeing what a working friendly Lambda might look like, please check out my project https://hook.io

    We’ve pushed over 43 million microservice deployments since our launch. I’d be glad to discuss the project more if you have any questions.

    Drop me a line: hello@marak.com

    • – Flynn

      Will do! I ran across hook.io after writing this, but have been heads-down building other stuff for awhile and haven’t had a chance to experiment yet.

  • John Lamm

    It’s an event based execution engine. The fact that AWS doesn’t hold your hand and serve up a silver platter tool is very smart of them. If they were to develop an opinionated “tool”, they would limit their market and the creativity of their users. With a small amount of concerted effort, you can easily string together a few CLI commands and generate your own deployment/update methods as has been said here already. As for the rest of the arguments: you can trigger a Lambda from just about any I/O service they provide on top of API Gateway (Ex: SNS, S3, Kinesis, etc). Inbound message models are well defined. Outbound message models are up to the developer’s discretion as they should be. And… if you write a decent script and catch your errors properly, notify a service and react? Finally, if dropping your server costs from around $21,000.00/mo to roughly $100.00/mo isn’t enough reason, then you’re following the long history of developers that are stuck in a ditch somewhere loyal to some aging “tool” that helped them not have to work.

    All the love, but I for one just heated up my dinner and parked in front of the TV for some PRIMETIME Lambda baby!

  • JimboJones007

    This was just announced, it might help with the deployments (assuming you use Bitbucket like we do)…

    Lambda deployment looks really clean and simple

    https://aws.amazon.com/blogs/apn/aws-sample-integrations-for-atlassian-bitbucket-pipelines/

  • Martin Streicher

    Does anyone offer an alternative to Lambda that works for Ruby, say?

    • rayray

      you might check out iron.io — their ironworker in particular.

  • SteveALee

    Your first 2 points nail the problems I’ve been hitting with cognito authentication with User pools and cognito sync. Arrrrgggggg. Life is too short. I’m going to try Azure App services and functions, even though javascript sync is not ready yet.

  • orubel

    I was invited to talk to Amazon a few months back and mentioned that they were not addressing the architectural cross-cutting concern (infoq.com/author/Owen-Rubel). I noted a lot of the same issues with Lambdas in the fact that they are self-contained and cannot extend classes or libraries and are VERY redundant.

    Also, everything has to go outside the DMZ to be accessed (ie your databases, services, etc) so IO is very slow.

    The manager who I was speaking with told me it was NEVER meant for real API deployment and was only intended for ‘small businesses’ or people who needed a fast endpoint. I told him that was good because that about the only purpose it can serve.

  • Randomo

    Did you try the CLI and CloudFormation for Lambda? It is wrong to imply that Lambda setup can only be a manual process… I mean, by that logic all command line tools are just ‘building blocks’ since you have to do something every time you use the tool… unless you have a script or template or something.

  • Ajo Abraham

    @disqus_7Usp7fdOx2:disqus Complete BS. Misleading title as some already pointed out. As you mention many times Lambda is a building block and you have to design around it. That fact doesn’t make it unready for primetime. We successfully launched https://www.lodr.io that uses lambda heavily to process billions of rows. Essentially a serverless Mapper. Using cloudwatch to monitor for errors and dynamo for tracking. At any given time we have 1000’s of lambdas executing.

    Learn more about our architecture here: https://blog.lodr.io/prep-100gb-in-70-seconds-for-redshift-serverless-mapreduce-f5aada98dea7#.jft8pacy9.

    We started with API gateway but quickly moved to a dedicated server for a variety of reasons. Just lambda alone has been amazing. 100GB processed for 64 cents. Unheard of!

  • Ughh…. So much configuration, no defaults, and no tips. I just want to have a functioning contact form : (

  • Black Nerd

    I agree with this article 100 percent… I honestly feel the entire open source community is just a bunch of buggy software: python, dynamo, lambda, step functions… etc.

  • Kevin Batchelor

    Using Serverless and Lambda is quite nice. I haven’t seen any issues that this blog rails on, and I’m using it to build an enterprise production app. It is fast, easy to use, and well documented in my opinion. Serverless addresses most of these concerns. http://serverless-stack.com/chapters/add-a-create-note-api.html

  • orubel

    Amazon and Amazon companies won’t even use their own services like AWS Gateway and Lambdas. They were interviewing me the other day and I asked this and they said they won’t use them. I pointed out the HUGE network latency and they said that might be a big reason but that they can’t comment as to why they don’t use them themselves.

    Having advised their team in the past and talked to the API manager, I had had him state directly to me that they were only ever intended for small to medium sized businesses and researchers; they are NOT intended to scale.

  • Arockiasamy

    It’s a very old post. Not valid anymore.

  • Byte11

    I get where you’re coming from, but two years later, Lambda is a lot more reliable. The documentation is still a pain. It’s not that the docs don’t exist, it’s just that they’re organized in the worst way humanly possible and buried under a lot of garbage. Despite this, it only took me a week to implement most of my server-side application through lambda. It wasn’t too hard at all. Here’s a link that I found very helpful: https://aws.amazon.com/getting-started/projects/build-serverless-web-app-lambda-apigateway-s3-dynamodb-cognito/