AWS Lambda durable functions enhance the Lambda programming model, allowing developers to create resilient multi-step applications and workflows. These functions maintain progress through interruptions and can pause for up to a year, making them ideal for scenarios requiring human approvals or scheduled delays without incurring compute costs.
This article walks through a fraud detection system built with durable functions and highlights best practices that carry over to other production workflows, such as approval processes and data pipelines. You will learn how to manage concurrent notifications, wait for customer responses, and recover from failures. If you're new to durable functions, consider starting with the Introduction to Durable Functions blog post.
Fraud Detection Workflow Overview
In a typical credit card fraud detection system, an AI agent evaluates incoming transactions and assigns risk scores. For cases deemed medium-risk, human approval is required before proceeding with the transaction. The workflow diverges based on the assessed risk level.
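The routing decision can be sketched as a small function. This is only an illustration: the source states that medium-risk cases go to human approval, while the numeric thresholds, the `Route` names, and the handling of low-risk and high-risk transactions are assumptions made for the example.

```python
from enum import Enum


class Route(Enum):
    AUTHORIZE = "authorize"            # assumed handling for low risk
    HUMAN_APPROVAL = "human_approval"  # medium risk, as described in the article
    ESCALATE = "escalate"              # assumed handling for high risk


# Hypothetical thresholds; a real system would tune these to its scoring model.
LOW_RISK_BELOW = 3
HIGH_RISK_FROM = 7


def route_transaction(risk_score: float) -> Route:
    """Map a risk score to the workflow branch that should handle it."""
    if risk_score < LOW_RISK_BELOW:
        return Route.AUTHORIZE
    if risk_score >= HIGH_RISK_FROM:
        return Route.ESCALATE
    return Route.HUMAN_APPROVAL
```

Keeping the routing in one pure function makes the branch boundaries easy to test independently of the workflow that calls it.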
Managing Delays and Idempotency
Human-in-the-loop workflows have response times that range from minutes to hours, so workflow state must be preserved durably without paying for compute while waiting. Idempotency is also crucial, especially in financial systems, to prevent duplicate actions such as charging a customer twice. Without durable execution, developers often address both needs with polling patterns backed by external state stores such as Amazon DynamoDB or Amazon S3.
Checkpointing and Durable Execution
Lambda durable functions introduce a reliable execution model that uses checkpoints to save progress. An execution can recover from a failure or resume after a wait by picking up from its last checkpoint, and it incurs no compute charges while idle. For a complete implementation of durable functions in fraud detection, refer to the GitHub repository, which includes deployment instructions and sample data.
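To make the checkpoint-and-replay idea concrete, here is a minimal in-memory sketch. It is a toy model, not the AWS SDK: `ToyDurableContext`, its `step` method, and the step names are hypothetical, and a plain dictionary stands in for the durable checkpoint store.

```python
from typing import Any, Callable, Dict


class ToyDurableContext:
    """Toy stand-in for a durable execution context (NOT the AWS SDK)."""

    def __init__(self, checkpoints: Dict[str, Any]) -> None:
        # The dict plays the role of storage that survives interruptions.
        self.checkpoints = checkpoints

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # On replay, a completed step returns its saved result without re-running.
        if name in self.checkpoints:
            return self.checkpoints[name]
        result = fn()
        # The checkpoint is written only after the step succeeds.
        self.checkpoints[name] = result
        return result


def fraud_workflow(ctx: ToyDurableContext,
                   score_fn: Callable[[], Any],
                   notify_fn: Callable[[], Any]) -> Dict[str, Any]:
    """Two-step workflow; the step names are illustrative."""
    score = ctx.step("score-transaction", score_fn)
    receipt = ctx.step("notify-customer", notify_fn)
    return {"score": score, "receipt": receipt}
```

If `notify_fn` raises on the first attempt, running `fraud_workflow` again with the same `checkpoints` dictionary skips the scoring step and retries only the notification. The failed step runs a second time, which is why the steps themselves need to be idempotent.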
Designing Idempotent Steps
Durable functions use at-least-once execution semantics by default: each step runs at least once, and may run again if a failure occurs before its result is checkpointed. Two strategies prevent duplicate actions:
- Strategy A: External API Idempotency Keys - Pass a stable idempotency key to the external API so that a retried call is recognized as a duplicate and, for example, does not charge the customer twice.
- Strategy B: At-Most-Once Semantics - Configure the step to execute zero or one time, so that it is never re-executed on retry.
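Both strategies can be sketched with in-memory stand-ins. Everything here is hypothetical and exists only for illustration: `FakePaymentApi` imitates an external API that honors idempotency keys, and `run_at_most_once` models Strategy B by recording an attempt marker before the step runs. Neither reflects the actual SDK configuration.

```python
import hashlib
from typing import Any, Callable, Dict, Set


def idempotency_key(transaction_id: str, step_name: str) -> str:
    """Strategy A: derive a key that is identical on every retry of the same step."""
    return hashlib.sha256(f"{transaction_id}:{step_name}".encode()).hexdigest()


class FakePaymentApi:
    """Hypothetical external API that deduplicates requests by idempotency key."""

    def __init__(self) -> None:
        self.charges: Dict[str, Dict[str, Any]] = {}

    def charge(self, key: str, amount_cents: int) -> Dict[str, Any]:
        # A repeated key returns the original charge instead of creating a new one.
        if key not in self.charges:
            charge_id = f"ch_{len(self.charges) + 1}"
            self.charges[key] = {"charge_id": charge_id, "amount_cents": amount_cents}
        return self.charges[key]


class StepAlreadyAttempted(Exception):
    """Raised when an at-most-once step would otherwise run a second time."""


def run_at_most_once(markers: Set[str], step_name: str, fn: Callable[[], Any]) -> Any:
    """Strategy B: record the attempt BEFORE running, so a retry never re-executes."""
    if step_name in markers:
        raise StepAlreadyAttempted(step_name)
    markers.add(step_name)
    return fn()
```

The trade-off is visible in the sketch: Strategy A lets a retry complete safely because the external system absorbs the duplicate, while Strategy B refuses the retry and leaves the workflow to decide how to handle a step whose outcome is unknown.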
Handling Concurrent Executions
To manage concurrent executions, durable functions provide DurableExecutionName, which ensures that only one execution runs per unique name. This is particularly useful when the same event can arrive more than once, for example through duplicate messages or repeated user actions.
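One way to take advantage of this is to derive the execution name deterministically from the business identifier, so that a redelivered message maps to the same name. The sketch below models that behavior with an in-memory registry. The name format, the `ToyExecutionRegistry` class, and the choice to hand back the existing execution on a duplicate are all assumptions; how the service itself reports a duplicate name is not modeled here.

```python
from typing import Dict, Tuple


def execution_name_for(transaction_id: str) -> str:
    """Hypothetical naming scheme: the same transaction always yields the same name."""
    return f"fraud-check-{transaction_id}"


class ToyExecutionRegistry:
    """In-memory model of 'one execution per unique name' (NOT the Lambda API)."""

    def __init__(self) -> None:
        self.executions: Dict[str, dict] = {}

    def start(self, name: str, payload: dict) -> Tuple[dict, bool]:
        # Returns (execution, started_new). A duplicate name starts nothing new.
        if name in self.executions:
            return self.executions[name], False
        execution = {"name": name, "payload": payload}
        self.executions[name] = execution
        return execution, True
```

With this scheme, two deliveries of the same transaction message resolve to a single execution, while a different transaction ID starts its own.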
Timeout Management
Lambda synchronous invocations are limited to a 15-minute timeout, while durable functions can run for up to one year when invoked asynchronously. Set timeouts explicitly so that a workflow waiting on a response that never arrives does not run far beyond its expected duration.
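A simple pre-deployment check can catch timeout settings that contradict these limits. The sketch below is an assumption-laden illustration: `validate_timeouts` is a hypothetical helper, "one year" is taken to mean 365 days, and the waits are assumed to happen sequentially so that their sum must fit within the overall execution timeout.

```python
from datetime import timedelta
from typing import Iterable

SYNC_INVOCATION_LIMIT = timedelta(minutes=15)
ASYNC_EXECUTION_LIMIT = timedelta(days=365)  # "one year", assumed to be 365 days


def validate_timeouts(execution_timeout: timedelta,
                      wait_timeouts: Iterable[timedelta],
                      synchronous: bool) -> None:
    """Raise ValueError if the configured timeouts cannot all be honored."""
    limit = SYNC_INVOCATION_LIMIT if synchronous else ASYNC_EXECUTION_LIMIT
    if execution_timeout > limit:
        raise ValueError(f"execution timeout {execution_timeout} exceeds limit {limit}")
    # Assumes sequential waits: together they must fit inside the execution timeout.
    total_wait = sum(wait_timeouts, timedelta())
    if total_wait > execution_timeout:
        raise ValueError(
            f"waits total {total_wait}, longer than execution timeout {execution_timeout}"
        )
```

For example, a 24-hour customer approval wait passes the check for an asynchronous execution with a 48-hour timeout, but fails for any synchronous invocation.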
Implementing Parallel Workflows
Durable functions simplify parallel workflows with context.parallel(), which runs branches concurrently while maintaining a durable checkpoint for each one. Branches that have already completed keep their results across retries and failures.
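The combination of concurrency and per-branch checkpoints can be modeled with the standard library. This sketch does not use context.parallel() or reproduce its signature; `run_parallel`, `run_branch`, and the dictionary-based checkpoint store are hypothetical stand-ins that show why a retry re-runs only the branches that did not finish.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

_store_lock = threading.Lock()


def run_branch(store: Dict[str, Any], name: str, fn: Callable[[], Any]) -> Any:
    """Run one branch unless a checkpoint for it already exists."""
    with _store_lock:
        if name in store:
            return store[name]
    result = fn()
    with _store_lock:
        store[name] = result  # checkpoint only after the branch succeeds
    return result


def run_parallel(store: Dict[str, Any],
                 branches: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Run all branches concurrently; completed branches are skipped on retry."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {
            name: pool.submit(run_branch, store, name, fn)
            for name, fn in branches.items()
        }
        # result() re-raises a branch's exception; leaving the 'with' block
        # still waits for the other branches, so their checkpoints are saved.
        return {name: future.result() for name, future in futures.items()}
```

In the fraud detection scenario, the branches could be an SMS and an email notification: if the email branch fails, a second call with the same store sends only the email, because the SMS result is already checkpointed.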
Conclusion
By integrating these best practices, the fraud detection workflow can effectively manage customer notifications and approvals while preserving business logic and minimizing compute costs. Deploy the fraud detection workflow from our GitHub repository to explore human-in-the-loop patterns in your own environment.