Introduction
In this post, I’ll walk you through how I set up AWS Lambda layers for my AI-based recipe generator project, Food Forager. The project uses a mix of modern libraries like the OpenAI API, LangChain, fpdf2, and the markdown library. With API Gateway imposing a 29-second integration timeout on synchronous requests, I split the workload across two Lambda functions:
- Async Lambda: This function handles incoming requests from the website, stores initial job status in DynamoDB, and asynchronously triggers the heavy processing function.
- Processing Lambda: This function performs the actual recipe generation using AI (via LangChain and the OpenAI API) and updates the generation status in DynamoDB.
By splitting responsibilities and using Lambda layers for shared libraries, I not only keep my deployment packages small but also overcome API Gateway’s execution time limits.
Why Use AWS Lambda Layers?
Lambda layers allow you to package libraries separately from your function code. In my project, I have two separate layers:
- OpenAI/LangChain Layer: Contains dependencies such as openai, langchain, langchain-openai, and langchain-community, along with pydantic-core. This keeps the AI-related libraries isolated and keeps the unzipped size under AWS’s 250 MB limit, which applies to a function and all of its attached layers combined.
- PDF/Markdown Layer: Contains fpdf2 and markdown, used to generate and format PDF recipes. This separation makes it easy to update or reuse these libraries across different Lambda functions without duplicating code (see the sketch just below).
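Neither Lambda excerpt later in this post shows the PDF step itself, so here is a minimal sketch of how these two libraries combine, using the same imports the project’s code uses. The `recipe_to_pdf_bytes` helper is my illustration, not code from Food Forager:

```python
from fpdf import FPDF, HTMLMixin
from markdown import markdown

class RecipePDF(FPDF, HTMLMixin):
    """FPDF with HTML rendering support."""

def recipe_to_pdf_bytes(recipe_markdown: str) -> bytes:
    pdf = RecipePDF()
    pdf.add_page()
    # markdown() converts the model's output to HTML; write_html lays it out
    pdf.write_html(markdown(recipe_markdown))
    return bytes(pdf.output())  # fpdf2 returns the document as a bytearray
```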
The Async Lambda: Handling Incoming Requests
Because API Gateway cuts off synchronous Lambda invocations after 29 seconds, the async Lambda immediately returns a response while delegating the heavy lifting to another function. It performs these steps:
- Parse the Request: Read ingredients, dietary restrictions, and the requested number of recipes.
- Store Initial Status: Insert a record into DynamoDB with a status of “PENDING” (the table schema is sketched after this list).
- Asynchronous Invocation: Trigger the processing Lambda with the job details so that the processing happens in the background.
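For reference, the only schema this flow needs is a table keyed on the generation id; everything else is stored as plain attributes. Here is a hypothetical one-time setup (the on-demand billing mode is my choice, not something the post specifies):

```python
import boto3

dynamodb = boto3.resource('dynamodb')
dynamodb.create_table(
    TableName='recipe_generations',
    KeySchema=[{'AttributeName': 'generation_id', 'KeyType': 'HASH'}],  # partition key only
    AttributeDefinitions=[{'AttributeName': 'generation_id', 'AttributeType': 'S'}],
    BillingMode='PAY_PER_REQUEST',  # assumption: on-demand capacity
)
```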
Here’s an excerpt from the async Lambda code:
```python
import boto3
import json
import time
import uuid
import os
import logging

# These come from the PDF/Markdown layer; they aren't used in this excerpt
from fpdf import FPDF, HTMLMixin
from markdown import markdown

# Initialize AWS clients and the DynamoDB table
dynamodb = boto3.resource('dynamodb')
lambda_client = boto3.client('lambda')
table = dynamodb.Table('recipe_generations')

def respond(status_code, body_dict):
    return {
        "statusCode": status_code,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
        },
        "body": json.dumps(body_dict),
    }

def initiate_generation(event):
    try:
        body = json.loads(event["body"])
        ingredients = body.get("ingredients", "")
        restrictions = body.get("restrictions", "No dietary restrictions")
        # Clamp the requested count to the 1-5 range
        recipe_count = min(max(int(body.get("recipeCount", 2)), 1), 5)
        generation_id = str(uuid.uuid4())

        # Store initial status in DynamoDB
        table.put_item(Item={
            'generation_id': generation_id,
            'status': 'PENDING',
            'created_at': int(time.time()),
            'ingredients': ingredients,
            'restrictions': restrictions,
            'recipe_count': recipe_count
        })

        # Invoke the processing Lambda asynchronously ('Event' returns immediately)
        lambda_client.invoke(
            FunctionName=os.environ['PROCESSING_LAMBDA_NAME'],
            InvocationType='Event',
            Payload=json.dumps({
                'generation_id': generation_id,
                'ingredients': ingredients,
                'restrictions': restrictions,
                'recipeCount': recipe_count
            })
        )

        return respond(202, {
            "generation_id": generation_id,
            "status": "PENDING",
            "message": "Recipe generation initiated"
        })
    except Exception as e:
        logging.error(f"Error in initiate_generation: {str(e)}")
        return respond(500, {"error": str(e)})
```
In this snippet, the function immediately returns a 202 Accepted response with a unique generation_id while the processing Lambda is invoked asynchronously.
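The excerpt doesn’t show how the website learns when the job finishes; presumably it polls a status endpoint backed by the same table. Here is a minimal sketch of what that lookup could look like. The `get_status` handler is my illustration, not code from the project, and it reuses the `table` and `respond` helpers defined above:

```python
def get_status(generation_id):
    # Look up the job record written by initiate_generation
    item = table.get_item(Key={'generation_id': generation_id}).get('Item')
    if not item:
        return respond(404, {"error": "Unknown generation_id"})
    body = {"generation_id": generation_id, "status": item['status']}
    if item['status'] == 'COMPLETED':
        body["recipes"] = item['recipes']
    return respond(200, body)
```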
The Processor Lambda: Offloading Heavy Lifting
The second Lambda function takes over the recipe generation. It updates the status in DynamoDB, generates the recipes using a ChatOpenAI model with LangChain, and finally stores the results back in the database. This approach bypasses the API Gateway timeout since it’s not directly exposed to incoming HTTP requests.
Here’s a key part of the processor Lambda code:
```python
import json
import boto3
import logging
import re

from langchain_openai import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

# Initialize AWS clients
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('recipe_generations')

def lambda_handler(event, context):
    # Read the id up front so the error handler can mark the job FAILED
    generation_id = event.get('generation_id')
    try:
        ingredients = event['ingredients']
        restrictions = event['restrictions']
        recipe_count = min(max(int(event.get('recipeCount', 2)), 1), 5)

        # Update status to PROCESSING
        table.update_item(
            Key={'generation_id': generation_id},
            UpdateExpression="SET #status = :status",
            ExpressionAttributeNames={'#status': 'status'},
            ExpressionAttributeValues={':status': 'PROCESSING'}
        )

        # Adjust max_tokens based on recipe count
        max_tokens = 1000 + (recipe_count * 500)

        # Generate recipes using ChatOpenAI
        chat = ChatOpenAI(
            model_name="gpt-4o-mini",
            temperature=0.7,
            max_tokens=max_tokens
        )
        system_message = SystemMessagePromptTemplate.from_template("You are a professional chef.")
        human_message = HumanMessagePromptTemplate.from_template(
            """
            The user has the following ingredients:
            {ingredients}

            They have these dietary restrictions or preferences:
            {restrictions}

            Generate {recipe_count} unique recipes. For each:
            - Provide a creative title
            - Write a short introduction
            - List the ingredients with approximate amounts
            - Provide step-by-step instructions

            Separate each recipe with ">>>>".
            """
        )
        prompt = ChatPromptTemplate.from_messages([system_message, human_message])
        messages = prompt.format_messages(
            ingredients=ingredients,
            restrictions=restrictions,
            recipe_count=recipe_count
        )
        # .invoke() is the current LangChain entry point; bare chat(messages) is deprecated
        response = chat.invoke(messages)
        recipes_data = parse_multiple_recipes(response.content, recipe_count)

        # Update DynamoDB with the generated recipes
        table.update_item(
            Key={'generation_id': generation_id},
            UpdateExpression="SET #status = :status, recipes = :recipes",
            ExpressionAttributeNames={'#status': 'status'},
            ExpressionAttributeValues={':status': 'COMPLETED', ':recipes': recipes_data}
        )
    except Exception as e:
        logging.error(f"Error processing recipe generation: {str(e)}")
        if generation_id:  # the failure may have happened before the job record existed
            table.update_item(
                Key={'generation_id': generation_id},
                UpdateExpression="SET #status = :status, #error = :error",
                ExpressionAttributeNames={'#status': 'status', '#error': 'error'},
                ExpressionAttributeValues={':status': 'FAILED', ':error': str(e)}
            )
        raise

def parse_multiple_recipes(text_output: str, recipe_count: int):
    recipes = []
    recipe_chunks = text_output.split(">>>>")
    for i, chunk in enumerate(recipe_chunks[:recipe_count], 1):
        chunk = chunk.strip()
        if not chunk:
            continue
        lines = chunk.splitlines()
        # Treat the first non-empty line as the title, the rest as the body
        title_line = next((line for line in lines if line.strip()), "Untitled Recipe")
        body_text = "\n".join(lines[lines.index(title_line) + 1:]).strip()
        recipes.append({
            "recipe_number": f"Recipe {i}",
            "title": title_line,
            "text": body_text,
        })
    return recipes[:recipe_count]
```
This code highlights the workflow: updating the status, generating recipes with a dynamically adjusted token count (to match the number of recipes requested), parsing the results, and storing them back in DynamoDB.
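To make the parsing concrete, here is what `parse_multiple_recipes` produces on a toy response (the recipe text is fabricated purely for illustration):

```python
sample = "Zesty Lemon Pasta\nA bright weeknight dish.\n>>>>\nHerb Omelette\nQuick and fluffy."
print(parse_multiple_recipes(sample, 2))
# [{'recipe_number': 'Recipe 1', 'title': 'Zesty Lemon Pasta', 'text': 'A bright weeknight dish.'},
#  {'recipe_number': 'Recipe 2', 'title': 'Herb Omelette', 'text': 'Quick and fluffy.'}]
```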
Deploying Lambda Layers
To ensure that both Lambdas have access to all required libraries without bloating the deployment package, I created two separate Lambda layers.
OpenAI & LangChain Layer
Requirements (requirements.txt):

```
openai
langchain
langchain-openai
langchain-community
pydantic-core
```
PDF & Markdown Layer
Requirements (requirements.txt):

```
fpdf2
markdown
```
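To build either layer, the dependencies have to sit under a top-level python/ directory inside the zip; that is where the Python runtime looks for layer packages. Publishing can then be scripted with boto3. The layer name, zip filename, and runtime below are placeholders, not the project’s actual values:

```python
import boto3

lambda_client = boto3.client('lambda')

# Assumes the zip was built with: pip install -r requirements.txt -t python/
# and then zipping the python/ directory.
with open('pdf-markdown-layer.zip', 'rb') as f:
    layer = lambda_client.publish_layer_version(
        LayerName='pdf-markdown-layer',     # placeholder name
        Content={'ZipFile': f.read()},
        CompatibleRuntimes=['python3.12'],  # match your function's runtime
    )
print(layer['LayerVersionArn'])
```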
By attaching these layers to your Lambda functions, you streamline dependency management and make updates easier: each function bundles only the code that is unique to it, which keeps deployment packages small and deployments nimble.
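Attaching the layers is one more boto3 call (or a console click), reusing the client from the publishing sketch above. Note that Layers replaces the function’s entire layer list, so pass every ARN the function needs; the function name and ARNs here are placeholders:

```python
lambda_client.update_function_configuration(
    FunctionName='food-forager-processor',  # placeholder
    Layers=[
        'arn:aws:lambda:us-east-1:123456789012:layer:openai-langchain-layer:1',
        'arn:aws:lambda:us-east-1:123456789012:layer:pdf-markdown-layer:1',
    ],
)
```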
Final Thoughts
Using multiple Lambdas with dedicated layers is an effective strategy to work around API Gateway’s 29-second integration timeout and manage dependencies in a clean, modular fashion. In this project, the async Lambda quickly responds to incoming requests while delegating heavy processing to another function. With layers managing external libraries, the setup remains clean and scalable.
I hope this breakdown helps you understand how to architect serverless applications that require complex dependencies and long-running processes. Thanks for reading!